talk-data.com

Topic

data

3406

tagged

Activities

Showing filtered results

Filtering by: O'Reilly Data Engineering Books

Azure Data Factory by Example: Practical Implementation for Data Engineers

Data engineers who need to hit the ground running will use this book to build skills in Azure Data Factory v2 (ADF). The tutorial-first approach to ADF taken in this book gets you working from the first chapter, explaining key ideas naturally as you encounter them. From creating your first data factory to building complex, metadata-driven nested pipelines, the book guides you through essential concepts in Microsoft’s cloud-based ETL/ELT platform. It introduces components indispensable for the movement and transformation of data in the cloud. Then it demonstrates the tools necessary to orchestrate, monitor, and manage those components. This edition, updated for 2024, includes the latest developments to the Azure Data Factory service: Enhancements to existing pipeline activities such as Execute Pipeline, along with the introduction of new activities such as Script, and activities designed specifically to interact with Azure Synapse Analytics. Improvements to flow control provided by activity deactivation and the Fail activity. The introduction of reusable data flow components such as user-defined functions and flowlets. Extensions to integration runtime capabilities including Managed VNet support. The ability to trigger pipelines in response to custom events. Tools for implementing boilerplate processes such as change data capture and metadata-driven data copying. 
What You Will Learn: Create pipelines, activities, datasets, and linked services. Build reusable components using variables, parameters, and expressions. Move data into and around Azure services automatically. Transform data natively using ADF data flows and Power Query data wrangling. Master flow-of-control and triggers for tightly orchestrated pipeline execution. Publish and monitor pipelines easily and with confidence.
Who This Book Is For: Data engineers and ETL developers taking their first steps in Azure Data Factory, SQL Server Integration Services users making the transition toward doing ETL in Microsoft’s Azure cloud, and SQL Server database administrators involved in data warehousing and ETL operations.

IBM GDPS: An Introduction to Concepts and Capabilities

This IBM Redbooks® publication presents an overview of the IBM Geographically Dispersed Parallel Sysplex® (IBM GDPS®) offerings and the roles they play in delivering a business IT resilience solution. The book begins with general concepts of business IT resilience and disaster recovery (DR), along with issues that are related to high application availability, data integrity, and performance. These topics are considered within the framework of government regulation, increasing application and infrastructure complexity, and the competitive and rapidly changing modern business environment. Next, it describes the GDPS family of offerings with specific reference to how they can help you achieve your defined goals for high availability and disaster recovery (HADR). Also covered are the features that simplify and enhance data replication activities, the prerequisites for implementing each offering, and tips for planning for the future and immediate business requirements. Tables provide easy-to-use summaries and comparisons of the offerings. The extra planning and implementation services available from IBM® also are explained. Then, several practical client scenarios and requirements are described, along with the most suitable GDPS solution for each case. The introductory chapters of this publication are intended for a broad technical audience, including IT System Architects, Availability Managers, Technical IT Managers, Operations Managers, System Programmers, and Disaster Recovery Planners. The subsequent chapters provide more technical details about the GDPS offerings, and each can be read independently for those readers who are interested in specific topics. Therefore, if you read all of the chapters, be aware that some information is intentionally repeated.

The Complete Developer

Whether you’ve been in the developer kitchen for decades or are just taking the plunge to do it yourself, The Complete Developer will show you how to build and implement every component of a modern stack, from scratch. You’ll go from a React-driven frontend to a fully fleshed-out backend with Mongoose, MongoDB, and a complete set of REST and GraphQL APIs, and back again through the whole Next.js stack. The book’s easy-to-follow, step-by-step recipes will teach you how to build a web server with Express.js, create custom API routes, deploy applications via self-contained microservices, and add a reactive, component-based UI. You’ll leverage command line tools and full-stack frameworks to build an application whose no-effort user management rides on GitHub logins.
You’ll also learn how to: Work with modern JavaScript syntax, TypeScript, and the Next.js framework. Simplify UI development with the React library. Extend your application with REST and GraphQL APIs. Manage your data with the MongoDB NoSQL database. Use OAuth to simplify user management, authentication, and authorization. Automate testing with Jest, test-driven development, stubs, mocks, and fakes.
Whether you’re an experienced software engineer or new to DIY web development, The Complete Developer will teach you to succeed with the modern full stack. After all, control matters.
Covers: Docker, Express.js, JavaScript, Jest, MongoDB, Mongoose, Next.js, Node.js, OAuth, React, REST and GraphQL APIs, and TypeScript

Practical MongoDB Aggregations

Dive into the capabilities of the MongoDB aggregation framework with this official guide, "Practical MongoDB Aggregations". You'll learn how to design and optimize efficient aggregation pipelines for MongoDB 7.0, empowering you to handle complex data analysis and processing tasks directly within the database.
What this Book will help me do: Gain expertise in crafting advanced MongoDB aggregation pipelines for custom data workflows. Learn to perform time series analysis for financial datasets and IoT applications. Discover optimization techniques for working with sharded clusters and large datasets. Master array manipulation and other specific operations essential for MongoDB data models. Build pipelines that ensure data security and distribution while maintaining performance.
Author(s): Paul Done, a recognized expert in MongoDB, brings his extensive experience in database technologies to this book. With years of practice in helping companies leverage MongoDB for big data solutions, Paul shares his deep knowledge in an accessible and logical manner. His approach to writing is hands-on, focusing on practical insights and clear explanations.
Who is it for? This book is tailored for intermediate-level developers, database architects, data analysts, engineers, and scientists who use MongoDB. If you are familiar with MongoDB and looking to expand your understanding specifically around its aggregation capabilities, this guide is for you. Whether you're analyzing time series data or need to optimize pipelines for performance, you'll find actionable tips and examples here to suit your needs.

Learn T-SQL Querying - Second Edition

Troubleshoot query performance issues, identify anti-patterns in your code, and write efficient T-SQL queries with this guide for T-SQL developers.
Key Features: A definitive guide to mastering the techniques of writing efficient T-SQL code. Learn query optimization fundamentals, query analysis, and how query structure impacts performance. Discover insightful solutions to detect, analyze, and tune query performance issues. Purchase of the print or Kindle book includes a free PDF eBook.
Book Description: Data professionals seeking to excel in Transact-SQL for Microsoft SQL Server and Azure SQL Database often lack comprehensive resources. Learn T-SQL Querying, Second Edition focuses on indexing queries and crafting elegant T-SQL code, enabling data professionals to gain mastery in modern SQL Server versions (2022) and Azure SQL Database. The book covers new topics like logical statement processing flow, data access using indexes, and best practices for tuning T-SQL queries. Starting with query processing fundamentals, the book lays a foundation for writing performant T-SQL queries. You’ll explore the mechanics of the Query Optimizer and Query Execution Plans, learning to analyze execution plans for insights into current performance and scalability. Using dynamic management views (DMVs) and dynamic management functions (DMFs), you’ll build diagnostic queries. The book covers indexing and delves into SQL Server’s built-in tools to expedite resolution of T-SQL query performance and scalability issues. Hands-on examples will guide you to avoid UDF pitfalls and understand features like predicate SARGability, Query Store, and Query Tuning Assistant. By the end of this book, you’ll have developed the ability to identify query performance bottlenecks, recognize anti-patterns, and avoid pitfalls.
What you will learn: Identify opportunities to write well-formed T-SQL statements. Familiarize yourself with the Cardinality Estimator for query optimization. Create efficient indexes for your existing workloads. Implement best practices for T-SQL querying. Explore Query Execution Dynamic Management Views. Utilize the latest performance optimization features in SQL Server 2017, 2019, and 2022. Safeguard query performance during upgrades to newer versions of SQL Server.
Who this book is for: This book is for database administrators, database developers, data analysts, data scientists, and T-SQL practitioners who want to master the art of writing efficient T-SQL code and troubleshooting query performance issues through practical examples. A basic understanding of T-SQL syntax, writing queries in SQL Server, and using the SQL Server Management Studio tool will be helpful to get started.

Azure Data Factory Cookbook - Second Edition

This comprehensive guide to Azure Data Factory shows you how to create robust data pipelines and workflows to handle both cloud and on-premises data solutions. Through practical recipes, you will learn to build, manage, and optimize ETL, hybrid ETL, and ELT processes. The book offers detailed explanations to help you integrate technologies like Azure Synapse, Data Lake, and Databricks into your projects.
What this Book will help me do: Master building and managing data pipelines using Azure Data Factory's latest versions and features. Leverage Azure Synapse and Azure Data Lake for streamlined data integration and analytics workflows. Enhance your ETL/ELT solutions with Microsoft Fabric, Databricks, and Delta tables. Employ debugging tools and workflows in Azure Data Factory to identify and solve data processing issues efficiently. Implement industry-grade best practices for reliable and efficient data orchestration and integration pipelines.
Author(s): Dmitry Foshin, Tonya Chernyshova, Dmitry Anoshin, and Xenia Ireton collectively bring years of expertise in data engineering and cloud-based solutions. They are recognized professionals in the Azure ecosystem, dedicated to sharing their knowledge through detailed and actionable content. Their collaborative approach ensures that this book provides practical insights for technical audiences.
Who is it for? This book is ideal for data engineers, ETL developers, and professional architects who work with cloud and hybrid environments. If you're looking to upskill in Azure Data Factory or expand your knowledge into related technologies like Synapse Analytics or Databricks, this is for you. Readers should have a foundational understanding of data warehousing concepts to fully benefit from the material.

Big Data Computing

This book primarily aims to provide an in-depth understanding of recent advances in big data computing technologies, methodologies, and applications, along with introductory details of big data computing models such as Apache Hadoop, MapReduce, Hive, Pig, Mahout, in-memory storage systems, NoSQL databases, and big data streaming services.

IBM FlashSystem and VMware Implementation and Best Practices Guide

This IBM® Redbooks® publication details the configuration and best practices for using the IBM FlashSystem® family of storage products within a VMware environment. The first version of this book was published in 2021 and specifically addressed IBM Spectrum® Virtualize Version 8.4 with VMware vSphere 7.0. This second version of this book includes all the enhancements that are available with IBM Spectrum Virtualize 8.5. Topics illustrate planning, configuring, operations, and preferred practices that include integration of IBM FlashSystem storage systems with the VMware vCloud suite of applications: VMware vSphere Web Client (vWC); vSphere Storage APIs - Storage Awareness (VASA); vSphere Storage APIs - Array Integration (VAAI); VMware Site Recovery Manager (SRM); VMware vSphere Metro Storage Cluster (vMSC); and Embedded VASA Provider for VMware vSphere Virtual Volumes (vVols). This book is intended for presales consulting engineers, sales engineers, and IBM clients who want to deploy IBM FlashSystem storage systems in virtualized data centers that are based on VMware vSphere.
Note: There is a newer version of this book: "IBM Storage Virtualize and VMware: Integrations, Implementation and Best Practices, SG24-8549". This book addresses IBM Storage Virtualize Version 8.6 with VMware vSphere 8. The new IBM Storage plugin for vSphere is covered in this book.

IBM TS7700 Release 5.3 Guide

This IBM Redbooks® publication covers IBM TS7700 R5.3. The IBM TS7700 is part of a family of IBM Enterprise tape products. This book is intended for system architects and storage administrators who want to integrate their storage systems for optimal operation. Building on over 25 years of experience, the R5.3 release includes many features that enable improved performance, usability, and security. Highlights include the IBM TS7700 Advanced Object Store, an all flash TS7770, grid resiliency enhancements, and Logical WORM retention. By using the same hierarchical storage techniques, the TS7700 (TS7770 and TS7760) can also off load to object storage. Because object storage is cloud-based and accessible from different regions, the TS7700 Cloud Storage Tier support essentially allows the cloud to be an extension of the grid. As of this writing, the TS7700C supports the ability to off load to IBM Cloud Object Storage, Amazon S3, and RSTOR. This publication explains features and concepts that are specific to the IBM TS7700 as of release R5.3. The R5.3 microcode level provides IBM TS7700 Cloud Storage Tier enhancements, IBM DS8000 Object Storage enhancements, Management Interface dual control security, and other smaller enhancements. The R5.3 microcode level can be installed on the IBM TS7770 and IBM TS7760 models only. TS7700 provides tape virtualization for the IBM Z® environment. Off loading to physical tape behind a TS7700 is used by hundreds of organizations around the world. 
New and existing capabilities of the TS7700 5.3 release include the following highlights: Support for IBM TS1160 Tape Drives and JE/JM media; Eight-way Grid Cloud, which consists of up to three generations of TS7700; Synchronous and asynchronous replication of virtual tape and TCT objects; Grid access to all logical volume and object data independent of where it resides; An all flash TS7770 option for improved performance; Full Advanced Object Store Grid Cloud support of DS8000 Transparent Cloud Tier; Full AES256 encryption for data that is in-flight and at-rest; Tight integration with IBM Z and DFSMS policy management; DS8000 Object Store with AES256 in-flight encryption and compression; Regulatory compliance through Logical WORM and LWORM Retention support; Cloud Storage Tier support for archive, logical volume versions, and disaster recovery; Optional integration with physical tape; 16 Gb IBM FICON® throughput that exceeds 4 GBps per TS7700 cluster; Grid Resiliency Support with Control Unit Initiated Reconfiguration (CUIR) support; IBM Z hosts view up to 3,968 3490 devices per TS7700 grid; TS7770 Cache On Demand feature that uses capacity-based licensing; TS7770 support of SSD within the VED server.
The TS7700T writes data by policy to physical tape through attachment to high-capacity, high-performance IBM TS1160, IBM TS1150, and IBM TS1140 tape drives that are installed in an IBM TS4500 or TS3500 tape library. The TS7770 models are based on high-performance and redundant IBM Power9® technology. They provide improved performance for most IBM Z tape workloads when compared to the previous generations of IBM TS7700.

IBM DS8000 Copy Services: Updated for IBM DS8000 Release 9.1

This IBM® Redbooks® publication helps you plan, install, configure, and manage Copy Services on the IBM DS8000® operating in an IBM Z® or Open Systems environment. This book helps you design and implement a new Copy Services installation or migrate from an existing installation. It includes hints and tips to maximize the effectiveness of your installation, and information about tools and products to automate Copy Services functions. It is intended for anyone who needs a detailed and practical understanding of the DS8000 Copy Services. This edition is an update for the DS8900 Release 9.1. Note that the Safeguarded Copy feature is covered in IBM DS8000 Safeguarded Copy, REDP-5506.

IBM and CMTG Cyber Resiliency: Building an Automated, VMware Aware Safeguarded Copy Solution to Provide Data Resilience

This IBM Blueprint outlines how CMTG and IBM have partnered to provide cyber resilient services to their clients. CMTG is one of Australia's leading private cloud providers based in Perth, Western Australia. The solution is based on IBM Storage FlashSystem, IBM Safeguarded Copy and IBM Storage Copy Data Management. The target audience for this Blueprint is IBM Storage technical specialists and storage admins.

Deciphering Data Architectures

Data fabric, data lakehouse, and data mesh have recently appeared as viable alternatives to the modern data warehouse. These new architectures have solid benefits, but they're also surrounded by a lot of hyperbole and confusion. This practical book provides a guided tour of these architectures to help data professionals understand the pros and cons of each. James Serra, big data and data warehousing solution architect at Microsoft, examines common data architecture concepts, including how data warehouses have had to evolve to work with data lake features. You'll learn what data lakehouses can help you achieve, as well as how to distinguish data mesh hype from reality. Best of all, you'll be able to determine the most appropriate data architecture for your needs.
With this book, you'll: Gain a working understanding of several data architectures. Learn the strengths and weaknesses of each approach. Distinguish data architecture theory from reality. Pick the best architecture for your use case. Understand the differences between data warehouses and data lakes. Learn common data architecture concepts to help you build better solutions. Explore the historical evolution and characteristics of data architectures. Learn essentials of running an architecture design session, team organization, and project success factors.
Free from product discussions, this book will serve as a timeless resource for years to come.

IBM Storage Virtualize, IBM Storage FlashSystem, and IBM SAN Volume Controller Security Feature Checklist - For IBM Storage Virtualize 8.6

IBM® Storage Virtualize based storage systems are secure storage platforms that implement various security-related features, in terms of system-level access controls and data-level security features. This document outlines the available security features and options of IBM Storage Virtualize based storage systems. It is not intended as a "how to" or best practice document. Instead, it is a checklist of features that can be reviewed by a user security team to aid in the definition of a policy to be followed when implementing IBM FlashSystem®, IBM SAN Volume Controller, and IBM Storage Virtualize for Public Cloud. IBM Storage Virtualize features the following levels of security to protect against threats and to keep the attack surface as small as possible: The first line of defense is to offer strict verification features that stop unauthorized users from using login interfaces and gaining access to the system and its configuration. The second line of defense is to offer least privilege features that restrict the environment and limit any effect if a malicious actor does access the system configuration. The third line of defense is to run in a minimal, locked down, mode to prevent damage spreading to the kernel and rest of the operating system. The fourth line of defense is to protect the data at rest that is stored on the system from theft, loss, or corruption (malicious or accidental). The topics that are discussed in this paper can be broadly split into two categories:
System security: This type of security encompasses the first three lines of defense that prevent unauthorized access to the system, protect the logical configuration of the storage system, and restrict what actions users can perform. It also ensures visibility and reporting of system level events that can be used by a Security Information and Event Management (SIEM) solution, such as IBM QRadar®.
Data security: This type of security encompasses the fourth line of defense. It protects the data that is stored on the system against theft, loss, or attack. These data security features include Encryption of Data At Rest (EDAR) or IBM Safeguarded Copy (SGC).
This document is correct as of IBM Storage Virtualize 8.6.

Mastering MongoDB 7.0 - Fourth Edition

Discover the many capabilities of MongoDB 7.0 with this comprehensive guide designed to take your database skills to new heights. By exploring advanced features like aggregation pipelines, role-based security, and MongoDB Atlas, you will gain in-depth expertise in modern data management. This book empowers you to create secure, high-performance database applications.
What this Book will help me do: Understand and implement advanced MongoDB queries for detailed data analysis. Apply optimized indexing techniques to maximize query performance. Leverage MongoDB Atlas for robust monitoring, efficient backups, and advanced integrations. Develop secure applications with role-based access control, auditing, and encryption. Create scalable and innovative solutions using the latest features in MongoDB 7.0.
Author(s): Marko Aleksendrić, Arek Borucki, and their co-authors are accomplished experts in database engineering and MongoDB development. They bring collective experience in teaching and practical application of MongoDB solutions across various industries. Their goal is to simplify complex topics, making them approachable and actionable for developers worldwide.
Who is it for? This book is written for developers, software engineers, and database administrators with experience in MongoDB who want to deepen their expertise. An understanding of basic database operations and queries is recommended. If you are looking to master advanced concepts and create secure, optimized, and scalable applications, this is the book for you.

IBM Storage Fusion Multicloud Object Gateway

This Redpaper provides an overview of IBM Storage Fusion Multicloud Object Gateway (MCG) and can be used as a quick reference guide for the most common use cases. The intended audience is cloud and application administrators, as well as other technical staff members who wish to learn how MCG works, how to set it up, and usage of a Backing Store or Namespace Store, as well as object caching.

Take Control of iOS & iPadOS Privacy and Security, 4th Edition

Master networking, privacy, and security for iOS and iPadOS! Version 4.2, updated January 29, 2024 Ensuring that your iPhone or iPad’s data remains secure and in your control and that your private data remains private isn’t a battle—if you know what boxes to check and how to configure iOS and iPadOS to your advantage. Take Control of iOS & iPadOS Privacy and Security takes you into the intricacies of Apple’s choices when it comes to networking, data sharing, and encryption—and protecting your personal safety. Substantially updated to cover dozens of changes and new features in iOS 17 and iPadOS 17! Your iPhone and iPad have become the center of your digital identity, and it’s easy to lose track of all the ways in which Apple and other parties access your data legitimately—or without your full knowledge and consent. While Apple nearly always errs on the side of disclosure and permission, many other firms don’t. This book comprehensively explains how to configure iOS 17, iPadOS 17, and iCloud-based services to best protect your privacy with messaging, email, browsing, and much more. The book also shows you how to ensure your devices and data are secure from intrusion from attackers of all types. You’ll get practical strategies and configuration advice to protect yourself against psychological and physical threats, including restrictions on your freedom and safety. For instance, you can now screen images that may contain nude images, while Apple has further enhanced Lockdown Mode to block potential attacks by governments, including your own. Take Control of iOS & iPadOS Privacy and Security covers how to configure the hundreds of privacy and data sharing settings Apple offers in iOS and iPadOS, and which it mediates for third-party apps. 
Safari now has umpteen different strategies built in by Apple to protect your web surfing habits, personal data, and identity, and new features in Safari, Mail, and Messages that block tracking of your movement across sites, actions on ads, and even when you open and view an email message. In addition to privacy and security, this book also teaches you everything you need to know about networking, whether you’re using 3G, 4G LTE, or 5G cellular, Wi-Fi or Bluetooth, or combinations of all of them; as well as about AirDrop, AirPlay, Airplane Mode, Personal Hotspot, and tethering. You’ll learn how to:

Twiddle 5G settings to ensure the best network speeds on your iPhone or iPad. Master the options for a Personal Hotspot for yourself and in a Family Sharing group. Set up a device securely from the moment you power up a new or newly restored iPhone or iPad. Manage Apple’s built-in second factor verification code generator for extra-secure website and app logins. Create groups of passwords and passkeys you can share securely with other iPhone, iPad, and Mac users. Decide whether Advanced Data Protection in iCloud, an enhanced encryption option that makes nearly all your iCloud data impossible for even Apple to view, makes sense for you. Use passkeys, a high-security but easy-to-use website login system with industry-wide support. Block unknown (and unwanted) callers, iMessage senders, and phone calls, now including FaceTime. Protect your email by using Hide My Email, an iCloud+ tool to generate an address Apple manages and relays messages through for you, now including email used with Apple Pay transactions. Use Safari’s blocking techniques and learn how to review websites’ attempts to track you, including the latest improvements in iOS 17 and iPadOS 17. Use Communication Safety, a way to alert your children about sensitive images, but now also a tool to keep unsolicited and unwanted images of private parts from appearing on your devices. Understand why Apple might ask for your iPhone, iPad, or Mac password when you log in on a new device using two-factor authentication. Keep yourself safe when en route to a destination by creating a Check In partner who will be alerted if you don’t reach your intended end point or don’t respond within a period of time. Dig into Private Browsing’s several new features in iOS 17/iPadOS 17, designed to let you leave no trace of your identity or actions behind, while protecting your iPhone or iPad from prying eyes, too. Manage data usage across two phone SIMs (or eSIMs) at home and while traveling.
Use a hardware encryption key to strongly protect your Apple ID account. Share a Wi-Fi password with nearby contacts and via a QR Code. Differentiate between encrypted data sessions and end-to-end encryption. Stream music and video to other devices with AirPlay 2. Use iCloud+’s Private Relay, a privacy-protecting browsing service that keeps your habits and locations from prying marketing eyes. Deter brute-force cracking by relying on an Accessories timeout for devices physically being plugged in that use USB and other standards. Configure Bluetooth devices. Enjoy enhanced AirDrop options that let you tap two iPhones to transfer files and continue file transfers over the internet when you move out of range. Protect Apple ID account and iCloud data from unwanted access at a regular level and via the new Safety Check, designed to let you review or sever digital connections with people you know who may wish you harm.

Building Information Modeling

This book presents how Building Information Modeling (BIM) and the use of shared representation of built assets facilitate design, construction and operation processes (ISO 19650). The modeling of public works data disrupts the art of construction. Written by both academics and engineers who are heavily involved in the French research project Modélisation des INformations INteropérables pour les INfrastructures Durables (MINnD) as well as in international standardization projects, this book presents the challenges of BIM from theoretical and practical perspectives. It provides knowledge for evolving in an ecosystem of federated models and common data environments, which are the basis of the platforms and data spaces. BIM makes it possible to handle interoperability very concretely, using open standards, which lead to openBIM. The use of a platform allows for the merging of business software and for approaches such as a Geographic Information System (GIS) to be added to the processes. In organizations, BIM meets the life cycles of structures and the circular economy. It is not only a technique that reshapes cooperation and trades around a digital twin but can also disrupt organizations and business models.