talk-data.com

Topic: data-engineering (3395 tagged)

[Chart: Activity Trend, 2020-Q1 to 2026-Q1, 1 peak/qtr]

Activities

3395 activities · Newest first

Data Warehousing in the Age of Artificial Intelligence

Nearly 7,000 new mobile applications appear every day, and a constant stream of data gives them life. Many organizations rely on a predictive analytics model to turn data into useful business information and ensure the predictions remain accurate as data changes. It can be a complex, time-consuming process. This book shows how to automate and accelerate that process using machine learning (ML) on a modern data warehouse that runs on any cloud. Product specialists from MemSQL explain how today’s modern data warehouses provide the foundations to implement ML algorithms that run efficiently. Through several real-time use cases, you’ll learn how to quickly identify the right metrics to make actionable business decisions. This book explores foundational ML and artificial intelligence concepts to help you understand:
• How data warehouses accelerate deployment and simplify manageability
• How companies make a choice between cloud and on-premises deployments for building data processing applications
• Ways to build analytics and visualizations for business intelligence on historical data
• The technologies and architecture for building and deploying real-time data pipelines
This book demonstrates specific models and examples for building supervised and unsupervised real-time ML applications, and gives practical advice on how to make the choice between building an ML pipeline or buying an existing solution. If you need to use data accurately and efficiently, a real-time data warehouse is a critical business tool.

Introduction to GPUs for Data Analytics

Moore’s law has finally run out of steam for CPUs. The number of x86 cores that can be placed cost-effectively on a single chip has reached a practical limit, making higher densities prohibitively expensive for most applications. Fortunately, for big data analytics, machine learning, and database applications, a more capable and cost-effective alternative for scaling compute performance is already available: the graphics processing unit, or GPU. In this report, executives at Kinetica and Sierra Communications explain how incorporating GPUs is ideal for keeping pace with the relentless growth in streaming, complex, and large data confronting organizations today. Technology professionals, business analysts, and data scientists will learn how their organizations can begin implementing GPU-accelerated solutions either on premises or in the cloud. This report explores:
• How GPUs supplement CPUs to enable continued price/performance gains
• The many database and data analytics applications that can benefit from GPU acceleration
• Why GPU databases with user-defined functions (UDFs) can simplify and unify the machine learning/deep learning pipeline
• How GPU-accelerated databases can process streaming data from the Internet of Things and other sources in real time
• The performance advantage of GPU databases in demanding geospatial analytics applications
• How cognitive computing, the most compute-intensive application currently imaginable, is now within reach using GPUs

Learning Ceph - Second Edition

Dive into 'Learning Ceph' to master Ceph, the powerful open-source storage solution known for its scalability and reliability. By following the book's clear instructions, you'll be equipped to deploy, configure, and integrate Ceph into your infrastructure for exabyte-scale data management.
What this Book will help me do
• Understand the architectural principles of Ceph and its uses.
• Gain practical skills in deploying and managing a Ceph cluster.
• Learn to monitor and troubleshoot Ceph systems effectively.
• Explore integration possibilities with OpenStack and other platforms.
• Apply advanced techniques like erasure coding and CRUSH map optimization.
Author(s)
The authors are experienced software engineers and open-source contributors with deep expertise in storage systems and distributed computing. They bring practical, real-world examples and accessible explanations to complex topics like Ceph architecture and operation. Their passion for empowering professionals with robust technical skills shines through in this book.
Who is it for?
This book is ideal for system administrators, cloud engineers, or storage professionals looking to expand their knowledge of software-defined storage solutions. Whether you're new to Ceph or seeking advanced tips for optimization, this guide has something for every skill level. Prerequisite knowledge includes familiarity with Linux and server architecture concepts.

PostgreSQL: Up and Running, 3rd Edition

Thinking of migrating to PostgreSQL? This clear, fast-paced introduction helps you understand and use this open source database system. Not only will you learn about the enterprise class features in versions 9.5 to 10, you’ll also discover that PostgreSQL is more than a database system: it’s an impressive application platform as well. With examples throughout, this book shows you how to achieve tasks that are difficult or impossible in other databases. This third edition covers new features, such as ANSI-SQL constructs found only in proprietary databases until now: foreign data wrapper (FDW) enhancements; new full text functions and operator syntax introduced in version 9.6; XML constructs new in version 10; query parallelization features introduced in 9.6 and enhanced in 10; built-in logical replication introduced in version 10. If you’re a current PostgreSQL user, you’ll pick up gems you may have missed before.
• Learn basic administration tasks such as role management, database creation, backup, and restore
• Apply the psql command-line utility and the pgAdmin graphical administration tool
• Explore PostgreSQL tables, constraints, and indexes
• Learn powerful SQL constructs not generally found in other databases
• Use several different languages to write database functions
• Tune your queries to run as fast as your hardware will allow
• Query external and variegated data sources with foreign data wrappers
• Learn how to use built-in replication to replicate data

Sams Teach Yourself PHP, MySQL & JavaScript: All in One, 6th Edition

In just a short time, you can learn how to use PHP, MySQL, and JavaScript together to create dynamic, interactive websites and applications using three leading web development technologies. No previous programming experience is required. Using a straightforward, step-by-step approach, each lesson in this book builds on the previous ones, enabling you to learn the essentials of full-stack web application development – from HTML, CSS, and JavaScript on the front end, to PHP scripting and MySQL databases on the server. Regardless of whether you run Linux, Windows, or MacOS, the book includes complete instructions to install all the software you need to set up a stable environment for learning, testing, and production. Step-by-step instructions carefully walk you through the most common web application development tasks. Practical, hands-on examples show you how to apply what you learn. Quizzes and exercises help you test your knowledge and stretch your skills.
Learn how to:
• Build web pages with HTML5 and CSS
• Use JavaScript to build dynamic, interactive web pages
• Get PHP, MySQL, and JavaScript to work together to create modern, standards-compliant web applications
• Enhance interactivity with AJAX
• Leverage JavaScript libraries such as jQuery
• Work with cookies and user sessions
• Get user input with web-based forms
• Use basic SQL commands
• Interact with the MySQL database using PHP
• Write maintainable code and get started with version control
• Decide when frameworks such as Bootstrap, Foundation, React, Angular, and Laravel can be useful
• Create a web-based discussion forum or calendar
• Add a storefront and shopping cart to your site
Contents at a Glance
PART I Web Application Basics
1 Understanding How the Web Works
2 Structuring HTML and Using Cascading Style Sheets
3 Understanding the CSS Box Model and Positioning
4 Introducing JavaScript
5 Introducing PHP
PART II Getting Started with Dynamic Web Sites
6 Understanding Dynamic Web Sites and HTML5 Applications
7 JavaScript Fundamentals: Variables, Strings, and Arrays
8 JavaScript Fundamentals: Functions, Objects, and Flow Control
9 Understanding JavaScript Event Handling
10 The Basics of Using jQuery
PART III Taking Your Web Applications to the Next Level
11 AJAX: Getting Started with Remote Scripting
12 PHP Fundamentals: Variables, Strings, and Arrays
13 PHP Fundamentals: Functions, Objects, and Flow Control
14 Working with Cookies and User Sessions
15 Working with Web-Based Forms
PART IV Integrating a Database into Your Applications
16 Understanding the Database Design Process
17 Learning Basic SQL Commands
18 Interacting with MySQL Using PHP
PART V Getting Started with Application Development
19 Creating a Simple Discussion Forum
20 Creating an Online Storefront
21 Creating a Simple Calendar
22 Managing Web Applications
PART VI Appendixes
A Installation QuickStart with XAMPP
B Installing and Configuring MySQL
C Installing and Configuring Apache
D Installing and Configuring PHP

Exam Ref 70-764 Administering a SQL Database Infrastructure

Prepare for Microsoft Exam 70-764 and help demonstrate your real-world mastery of skills for database administration. This exam is intended for database administrators charged with installation, maintenance, and configuration tasks. Their responsibilities also include setting up database systems, making sure those systems operate efficiently, and regularly storing, backing up, and securing data from unauthorized access. Focus on the expertise measured by these objectives:
• Configure data access and auditing
• Manage backup and restore of databases
• Manage and monitor SQL Server instances
• Manage high availability and disaster recovery
This Microsoft Exam Ref:
• Organizes its coverage by exam objectives
• Features strategic, what-if scenarios to challenge you
• Assumes you have working knowledge of database installation, configuration, and maintenance tasks. You should also have experience with setting up database systems, ensuring those systems operate efficiently, regularly storing and backing up data, and securing data from unauthorized access.
About the Exam
Exam 70-764 focuses on skills and knowledge required for database administration.
About Microsoft Certification
Passing both Exam 70-764 and Exam 70-765 (Provisioning SQL Databases) earns you credit toward an MCSA: SQL 2016 Database Administration certification. See full details at: microsoft.com/learning

Oracle Database 12c Release 2 Testing Tools and Techniques for Performance and Scalability

Master Oracle Database 12c Release 2 testing and tuning
Seamlessly transition to Oracle Database 12c Release 2 and achieve peak performance using the step-by-step instruction and best practices contained in this Oracle Press guide. Written by a team of Oracle ACEs, Oracle Database 12c Release 2 Testing Tools and Techniques for Performance and Scalability clearly explains how to identify, investigate, and resolve performance issues. You will discover how to use troubleshooting tools and test rigs, optimize code and queries, evaluate database performance, perform realistic application testing, capture and replay actual production workloads, and employ Oracle Database In-Memory.
• Establish benchmarks and evaluate application workload performance
• Configure and deploy SQL Tuning Advisor and SQL Access Advisor
• Maximize efficiency using Oracle Database In-Memory and In-Memory Advisor
• Identify and repair poorly running code with SQL Monitor
• Uncover database problems using Real-Time ADDM and Emergency Monitoring
• Work with database workload capture and replay
• Analyze third-party code with Workload Intelligence
• Identify database objects that will benefit most from In-Memory Column Store (IMCS)
• Monitor and manage IMCS objects with In-Memory Central

Manage Your SAP Projects with SAP Activate

Dive into SAP Activate, a cutting-edge methodology for SAP S/4HANA implementation, designed to enhance your project management effectiveness. This book delivers a step-by-step introduction to the SAP Activate framework, covering Agile and Scrum approaches. You will learn how this framework facilitates achieving project objectives efficiently, providing you with the tools to streamline your SAP projects.
What this Book will help me do
• Understand the key components and significance of SAP S/4HANA.
• Learn the framework and pillars of SAP Activate for successful SAP implementation.
• Master application of Agile and Scrum methodologies within SAP projects.
• Explore real-world case studies demonstrating SAP Activate in action.
• Develop a sample project using the SAP Activate framework to build hands-on expertise.
Author(s)
Vinay Singh is a seasoned SAP consultant with extensive experience in SAP implementations across various industries. With a focus on methodical and actionable guidance, Vinay has crafted his writing to help readers excel in practical SAP implementation. His work is complemented by a rich understanding of Agile methodologies applied to SAP contexts.
Who is it for?
This book is ideal for SAP professionals and consultants aspiring to efficiently implement and manage SAP projects using the SAP Activate approach. It is especially beneficial for those familiar with SAP HANA looking to transition from traditional waterfall methods to more agile frameworks. Readers seeking to enhance their project management skillset for SAP S/4HANA will find this book indispensable.

IBM Spectrum Virtualize Considerations for PCI-DSS Compliance

The Payment Card Industry Data Security Standard (PCI-DSS) is the global information security standard for organizations that process, store, or transmit data with any of the major credit card brands. More and more organizations are looking for compliance with this standard. This IBM® Redpaper™ describes how the features and functions of IBM Spectrum™ Virtualize help organizations bring their IT infrastructure toward compliance with the relevant areas of the PCI-DSS standard. IBM Spectrum Virtualize™ is the software common to all IBM Storwize® products such as IBM SAN Volume Controller (SVC), IBM Storwize V5000 family, IBM Storwize V7000, IBM FlashSystem® V9000, and IBM Spectrum Virtualize as Software. Therefore, all recommendations in this paper apply equally to these storage products.

Web Development with MongoDB and Node - Third Edition

Explore the power of combining Node.js and MongoDB to build modern, scalable web applications in 'Web Development with MongoDB and Node.' You'll not only learn how to integrate these two technologies effectively, but you'll also gain practical insights into using modern frameworks like Express and Angular to build feature-rich web apps.
What this Book will help me do
• Master core concepts of Node.js and MongoDB for efficient web development.
• Learn to build and configure a web server using the Express.js framework.
• Implement data persistence with MongoDB using the Mongoose ODM library.
• Automate testing using tools like Mocha and streamline workflows with Gulp.
• Deploy applications to cloud platforms like Heroku, AWS, and Microsoft Azure.
Author(s)
Jason Krol, Joseph D'mello, and Satheesh bring extensive experience in web development and technical writing to this book. The authors have collectively worked on cutting-edge web technologies for years and are passionate about sharing their expertise to help developers create efficient web applications.
Who is it for?
This book is perfect for JavaScript developers at any proficiency level who are looking to expand their skills into full-stack development with Node.js and MongoDB. Even if you have only a basic understanding of JavaScript and HTML, this book will guide you through building complete web applications from scratch. If you're eager to learn and create performant, scalable web apps, this book is for you.

IBM Copy Services Manager Implementation Guide

Abstract This IBM® Redbooks® publication provides an overview of IBM Copy Services Manager (CSM) for IBM Z and open systems, and documents a set of scenarios for using IBM Copy Services Manager to automate and manage replication tasks based on IBM Storage. This book reviews and explains the usage of copy services functions and describes how these functions are implemented in IBM Copy Services Manager. IBM Copy Services Manager key concepts, architecture, session types and usage, and new functionality as of IBM Copy Services Manager version 6.1 are also described.

Practical Real-time Data Processing and Analytics

This book provides a comprehensive guide to real-time data processing and analytics using modern frameworks like Apache Spark, Flink, Storm, and Kafka. Through practical examples and in-depth explanations, you will learn how to implement efficient, scalable, real-time processing pipelines.
What this Book will help me do
• Understand real-time data processing essentials and the technology stack
• Learn integration of components like Apache Spark and Kafka
• Master the concepts of stream processing with detailed case studies
• Gain expertise in developing monitoring and alerting solutions for real-time systems
• Prepare to implement production-grade real-time data solutions
Author(s)
Shilpi Saxena and Saurabh Gupta, the authors, are experienced professionals in distributed systems and data engineering, focusing on practical applications of real-time computing. They bring their extensive industry experience to this book, helping readers understand the complexities of real-time data solutions in an approachable and hands-on manner.
Who is it for?
This book is ideal for software engineers and data engineers with a background in Java who seek to develop real-time data solutions. It is suitable for readers familiar with concepts of real-time data processing, and enhances knowledge in frameworks like Spark, Flink, Storm, and Kafka. Target audience includes learners building production data solutions and those designing distributed analytics engines.

IBM z14 Configuration Setup

Abstract This IBM® Redbooks® publication helps you install, configure, and maintain the IBM z14. The z14 offers new functions that require a comprehensive understanding of the available configuration options. This book presents configuration setup scenarios, and describes implementation examples in detail. This publication is intended for systems engineers, hardware planners, and anyone who needs to understand IBM Z configuration and implementation. Readers should be generally familiar with current IBM Z technology and terminology. For more information about the functions of the z14, see IBM z14 Technical Introduction, SG24-8450 and IBM z14 Technical Guide, SG24-8451.

Apache Spark 2.x Machine Learning Cookbook

This book is your gateway to mastering machine learning with Apache Spark 2.x. Through detailed hands-on recipes, you'll delve into building scalable ML models, optimizing big data processes, and enhancing project efficiency. Gain practical knowledge and explore real-world applications of recommendations, clustering, analytics, and more with Spark's powerful capabilities.
What this Book will help me do
• Understand how to integrate Scala and Spark for effective machine learning development.
• Learn to create scalable recommendation engines using Spark.
• Master the development of clustering systems to organize unlabelled data at scale.
• Explore Spark libraries to implement efficient text analytics and search engines.
• Optimize large-scale data operations, tackling high-dimensional issues with Spark.
Author(s)
The team of authors brings expertise in machine learning, data science, and Spark technologies. Their combined industry experience and academic knowledge ensure the book is grounded in practical applications while offering theoretical insights. With clear explanations and a step-by-step approach, they aim to simplify complex concepts for developers and data scientists.
Who is it for?
This book is crafted for Scala developers familiar with machine learning concepts but seeking practical applications with Spark. If you have been implementing models but want to scale them and leverage Spark's robust ecosystem, this guide will serve you well. It is ideal for professionals seeking to deepen their skills in Spark and data science.

Kafka: The Definitive Guide

Every enterprise application creates data, whether it’s log messages, metrics, user activity, outgoing messages, or something else. And how to move all of this data becomes nearly as important as the data itself. If you’re an application architect, developer, or production engineer new to Apache Kafka, this practical guide shows you how to use this open source streaming platform to handle real-time data feeds. Engineers from Confluent and LinkedIn who are responsible for developing Kafka explain how to deploy production Kafka clusters, write reliable event-driven microservices, and build scalable stream-processing applications with this platform. Through detailed examples, you’ll learn Kafka’s design principles, reliability guarantees, key APIs, and architecture details, including the replication protocol, the controller, and the storage layer.
• Understand publish-subscribe messaging and how it fits in the big data ecosystem
• Explore Kafka producers and consumers for writing and reading messages
• Understand Kafka patterns and use-case requirements to ensure reliable data delivery
• Get best practices for building data pipelines and applications with Kafka
• Manage Kafka in production, and learn to perform monitoring, tuning, and maintenance tasks
• Learn the most critical metrics among Kafka’s operational measurements
• Explore how Kafka’s stream delivery capabilities make it a perfect source for stream processing systems

IBM zPDT 2017 Sysplex Extensions

Abstract This IBM® Redbooks® publication describes the IBM System z® Personal Development Tool (IBM zPDT®) 2017 Sysplex Extensions, which is a package that consists of sample files and supporting documentation to help you get a functioning, data sharing sysplex up and running with minimal time and effort. This book is a significant revision of zPDT 2016 Sysplex Extensions, SG24-8315, which is still available online for readers who need the IBM z/OS® 2.1 level of this package. This package is designed and tested to be installed on top of a standard Application Developer Controlled Distribution (ADCD) environment. It provides the extra files that you need to create a two-way data sharing IBM z/OS 2.2 sysplex that runs under IBM z/VM® in a zPDT environment.

High Availability for Oracle Database with IBM PowerHA SystemMirror and IBM Spectrum Virtualize HyperSwap

This IBM® Redpaper™ publication describes the use of the IBM Spectrum™ Virtualize HyperSwap® function to provide a high availability (HA) storage infrastructure for Oracle databases across metro distances, using the IBM SAN Volume Controller. The HyperSwap function is available on all IBM storage technologies that use IBM Spectrum Virtualize™ software, which include the IBM SAN Volume Controller, IBM Storwize® V5000, IBM Storwize V7000, IBM FlashSystem® V9000, and IBM Spectrum Virtualize as software. This paper focuses on the functional behavior of HyperSwap when subjected to various failure conditions and provides detailed timings and error recovery sequences that occur in response to these failure conditions. This paper does not provide the details necessary to implement the reference architectures (although some implementation detail is provided).

IBM TS4500 R4 Tape Library Guide

Abstract The IBM® TS4500 (TS4500) tape library is a next-generation tape solution that offers higher storage density and more integrated management than previous solutions. This IBM Redbooks® publication gives you a close-up view of the new IBM TS4500 tape library. In the TS4500, IBM delivers the density that today's and tomorrow's data growth requires. It has the cost-effectiveness and the manageability to grow with business data needs, while you preserve existing investments in IBM tape library products. Now, you can achieve both a low cost per terabyte (TB) and a high TB density per square foot, because the TS4500 can store up to 8.25 petabytes (PB) of uncompressed data in a single frame library or scale up at 1.5 PB per square foot to over 263 PB, which is more than 4 times the capacity of the IBM TS3500 tape library. The TS4500 offers these benefits:
• High availability dual active accessors with integrated service bays to reduce inactive service space by 40%. The Elastic Capacity option can be used to completely eliminate inactive service space.
• Flexibility to grow: The TS4500 library can grow from both the right side and the left side of the first L frame because models can be placed in any active position.
• Increased capacity: The TS4500 can grow from a single L frame up to an additional 17 expansion frames with a capacity of over 23,000 cartridges. High-density (HD) generation 1 frames from the existing TS3500 library can be redeployed in a TS4500.
• Capacity on demand (CoD): CoD is supported through entry-level, intermediate, and base-capacity configurations.
• Advanced Library Management System (ALMS): ALMS supports dynamic storage management, which enables users to create and change logical libraries and configure any drive for any logical library.
• Support for the IBM TS1155 tape drive, while also supporting the TS1150 and TS1140 tape drives: The TS1155 gives organizations an easy way to deliver fast access to data, improve security, and provide long-term retention, all at a lower cost than disk solutions. The TS1155 offers high-performance, flexible data storage with support for data encryption. Also, this enhanced fifth-generation drive can help protect investments in tape automation by offering compatibility with existing automation. The new TS1155 Tape Drive Model 55E delivers a 10 Gb Ethernet host attachment interface optimized for cloud-based and hyperscale environments. The TS1155 Tape Drive Model 55F delivers a native data rate of 360 MBps, the same load/ready, locate speeds, and access times as the TS1150, and includes dual-port 8 Gb Fibre Channel support.
• Support of the IBM Linear Tape-Open (LTO) Ultrium 7 tape drive: The LTO Ultrium 7 offering represents significant improvements in capacity, performance, and reliability over the previous generation, LTO Ultrium 6, while still protecting your investment in the previous technology.
• Integrated TS7700 back-end Fibre Channel (FC) switches are available.
• Up to four library-managed encryption (LME) key paths per logical library are available.
This book describes the TS4500 components, feature codes, specifications, supported tape drives, encryption, new integrated management console (IMC), and command-line interface (CLI). You learn how to accomplish several specific tasks:
• Improve storage density with increased expansion frame capacity up to 2.4 times and support 33% more tape drives per frame.
• Manage storage by using the ALMS feature.
• Improve business continuity and disaster recovery with dual active accessor, automatic control path failover, and data path failover.
• Help ensure security and regulatory compliance with tape-drive encryption and Write Once Read Many (WORM) media.
• Support IBM LTO Ultrium 7, 6, and 5, IBM TS1155, TS1150, and TS1140 tape drives.
• Provide a flexible upgrade path for users who want to expand their tape storage as their needs grow.
• Reduce the storage footprint and simplify cabling with 10 U of rack space on top of the library.
This guide is for anyone who wants to understand more about the IBM TS4500 tape library. It is particularly suitable for IBM clients, IBM Business Partners, IBM specialist sales representatives, and technical specialists.

Learn FileMaker Pro 16: The Comprehensive Guide to Building Custom Databases

Extend FileMaker's built-in functionality and totally customize your data management environment with specialized functions and menus to super-charge the results and create a truly unique and focused experience. This book includes everything a beginner needs to get started building databases with FileMaker and contains advanced tips and techniques that the most seasoned professionals will appreciate. Written by a longtime FileMaker developer, this book contains material for developers of every skill level. FileMaker Pro 16 is a powerful database development application used by millions of people in diverse industries to simplify data management tasks, leverage their business information in new ways, and automate many mundane tasks. A custom solution built with FileMaker can quickly tap into a powerful set of capabilities and technologies to offer users an intuitive and pleasing environment in which to achieve new levels of efficiency and professionalism.
What You’ll Learn
• Create SQL queries to build fast and efficient formulas
• Discover new features of version 16 such as JSON functions, Cards, Layout Object window, SortValues, UniqueValues, and using variables in Data Sources
• Write calculations using built-in functions and create your own custom functions
• Discover the importance of a good approach to interface and technical design
• Apply best practices for naming conventions and usage standards
• Explore advanced topics about designing professional, open-ended solutions and using advanced techniques
Who This Book Is For
Casual programmers, full-time consultants, and IT professionals.