talk-data.com

Topic

data-engineering

3377

tagged

Activities

Filtering by: O'Reilly Data Engineering Books

Learning ELK Stack

Dive into the ELK stack (Elasticsearch, Logstash, and Kibana) with this comprehensive guide. Designed to help you set up, configure, and utilize the stack to its fullest, this book provides you with the skills to manage data with precision, enrich logs, and create meaningful analytics. Develop an entire data pipeline and cultivate powerful visual insights from your data.
What this Book will help me do
· Install and configure Elasticsearch, Logstash, and Kibana to establish a robust ELK stack setup.
· Understand the role of each component in the stack and master the basics of log analysis.
· Create custom Logstash plugins to handle non-standard data processing requirements.
· Develop interactive and insightful data visualizations and dashboards using Kibana.
· Implement a complete data pipeline and gain expertise in data indexing, searching, and reporting.
Author(s)
Chhajed brings depth of technical understanding and practical experience to the exploration of the ELK Stack. With a strong background in open-source technologies and data analytics, Chhajed has worked extensively with ELK stack implementations in real-world scenarios. Through this guide, the author offers clarity, detailed examples, and actionable insights for professionals seeking to improve their data systems.
Who is it for?
This book is targeted towards software developers, data analysts, and DevOps engineers seeking to harness the potential of the ELK stack for data analysis and logging. It is most suitable for intermediate-level professionals with basic knowledge of Unix or programming. If your aim is to gain insights and build metrics from diverse data formats utilizing open-source technologies, this book is crafted for you.

Beginning SAP Fiori

Take a deep dive into SAP Fiori and discover Fiori architecture, Fiori landscape installation, Fiori standard applications, Fiori Launchpad configuration, tools for developing Fiori applications and extending standard Fiori applications.
You will learn:
· Fiori architecture and its applications
· Setting up a Fiori landscape and Fiori Launchpad
· Configuring, customizing and enhancing standard Fiori applications
· Developing Fiori native applications for mobile
· Internet of Things-based custom Fiori applications with the HANA cloud platform
Bince Mathew, a SAP mobility expert working for an MNC in Germany, shows you how SAP Fiori, based on HTML5 technology, addresses the most widely and frequently used SAP transactions like purchase order approvals, sales order creation, information lookup, and self-service tasks. This set of HTML5 apps provides a very simple and accessible experience across desktops, tablets, and smartphones.

Learning Couchbase

Embark on your journey to mastering Couchbase with this comprehensive guide designed for learners of all levels. By exploring the fundamentals of NoSQL databases and diving into Couchbase's functionality, you'll gain the skills to design, manage, and scale modern applications effectively. Learn practical solutions and techniques to leverage Couchbase as a powerful backend system.
What this Book will help me do
· Understand the core concepts of NoSQL databases and configure a Couchbase database system from scratch.
· Design efficient document data schemas and use Couchbase SDKs for high-performance application development.
· Explore the integration of Couchbase with Elasticsearch to implement robust full-text search capabilities.
· Master advanced Couchbase features like XDCR for disaster recovery and N1QL for SQL-like application queries.
· Develop and scale a real-world e-commerce application using Couchbase as the backend database system.
Author(s)
Henry Potsangbam is an experienced software developer and database specialist with a focus on scalable NoSQL solutions. He has worked extensively with Couchbase in developing real-world applications and is passionate about teaching others the intricacies of database systems. Henry's writing style makes advanced concepts accessible and practical for readers of all levels.
Who is it for?
This book is crafted for developers, database administrators, and IT professionals who want to learn NoSQL database basics and Couchbase's capabilities. Beginners with no prior experience in NoSQL will find step-by-step guidance, and experienced developers can expand their skill set to include Couchbase. A familiarity with Java programming will be helpful but is not mandatory.

Apache Cassandra Essentials

"Apache Cassandra Essentials" is your guide to understanding and mastering the core concepts of Apache Cassandra. Whether you're setting up your first Cassandra cluster or optimizing performance, this book provides actionable steps and insights to help you design highly responsive database architectures.
What this Book will help me do
· Set up and configure a Cassandra cluster for optimal performance.
· Design schemas in Cassandra using CQL for evenly distributed data.
· Employ tools to monitor and maintain Cassandra clusters effectively.
· Debug queries to improve database query performance.
· Tune Cassandra to adapt to specific operational environments.
Author(s)
Nitin Padalia, the author, is an experienced database engineer with a deep understanding of distributed systems. With years of experience working with Apache Cassandra and similar technologies, he has dedicated his efforts to simplifying complex concepts for developers. His clear and straightforward writing helps readers build expertise efficiently.
Who is it for?
This book is perfect for developers who are already familiar with Cassandra and want a deeper understanding of its architecture and functionality. If you're interested in diving into the non-relational aspects of Cassandra or need guidance on database optimization, you'll find this book invaluable. It's designed for those ready to advance their skills and maximize the potential of their Cassandra deployments.

IBM Storwize V7000, Spectrum Virtualize, HyperSwap, and VMware Implementation

IBM® Spectrum Virtualize Software Version 7.5 provides software-defined storage capabilities across various platforms, including IBM SAN Volume Controller, IBM Storwize® V7000, Storwize V7000 (Unified), Storwize V5000, Storwize V3700, and Storwize V3500. These offerings help clients reduce the complexities and cost of managing their storage in the following ways:
· Centralizing management of storage volumes to enable administrators to manage storage volumes from a single point
· Improving utilization of storage capacity with virtual volumes to enable businesses to tap into previously unused disk capacity
· Avoiding downtime for backups, maintenance, and upgrades
· Performing data migration without disruption to applications
· Enabling all storage devices to be organized into storage pools from which virtual volumes, whether standard, compressed, or thin-provisioned, are created with the characteristics that you want
· Delivering automation of storage management with SmartCloud Virtual Storage Center, IBM Tivoli® Storage Productivity Center (as applicable by platform), and IBM Tivoli Storage FlashCopy® Manager (as applicable by platform)
· Increasing the performance efficiency of storage pools with IBM Easy Tier®
· Restoring data access quickly with near and remote copy capabilities across Fibre Channel (FC), Fibre Channel over Ethernet (FCoE), and IP networks
In this IBM Redbooks® publication, which is aimed at storage administrators and technical professionals, we describe the IBM HyperSwap® capability in IBM Spectrum™ Virtualize Software V7.5. HyperSwap delivers high availability (HA) and disaster recovery (DR) in one solution and reuses capital investments to achieve a range of recovery and management options that are transparent to host operations. This book describes how you can use HyperSwap with VMware to create an environment that can withstand robust workloads.

Integrating IBM PureApplication System into an Existing Data Center

This IBM® Redbooks® publication helps you with the integration of IBM PureApplication® System and IBM PureApplication Software into an existing data center. This publication describes certain scenarios that are considered critical (based on IBM client experiences) for a successful implementation of PureApplication Software or PureApplication System into an existing data center. It covers the planning, installation, and configuration of both PureApplication System and PureApplication Software. Both PureApplication System and PureApplication Software offer on-premises solutions that use proven patterns to extend your applications, reduce cost and complexity, and ease management. This book is useful for solution specialists, system or software architects, and the IT teams who need more in-depth knowledge about the integration of PureApplication System and PureApplication Software.

Elasticsearch in Action

Elasticsearch in Action teaches you how to build scalable search applications using Elasticsearch. You'll ramp up fast, with an informative overview and an engaging introductory example. Within the first few chapters, you'll pick up the core concepts you need to implement basic searches and efficient indexing. With the fundamentals well in hand, you'll go on to gain an organized view of how to optimize your design. Perfect for developers and administrators building and managing search-oriented applications.
About the Technology
Modern search seems like magic: you type a few words and the search engine appears to know what you want. With the Elasticsearch real-time search and analytics engine, you can give your users this magical experience without having to do complex low-level programming or understand advanced data science algorithms. You just install it, tweak it, and get on with your work.
About the Book
Elasticsearch in Action teaches you how to write applications that deliver professional quality search. As you read, you'll learn to add basic search features to any application, enhance search results with predictive analysis and relevancy ranking, and use saved data from prior searches to give users a custom experience. This practical book focuses on Elasticsearch's REST API via HTTP. Code snippets are written mostly in bash using cURL, so they're easily translatable to other languages.
What's Inside
· What is a great search application?
· Building scalable search solutions
· Using Elasticsearch with any language
· Configuration and tuning
About the Reader
This book is for developers and administrators building and managing search-oriented applications.
About the Authors
Radu Gheorghe is a search consultant and software engineer. Matthew Lee Hinman develops highly available, cloud-based systems. Roy Russo is a specialist in predictive analytics.
Quotes
To understand how a modern search infrastructure works is a daunting task. Radu, Matt, and Roy make it an engaging, hands-on experience. - Sen Xu, Twitter Inc.
An indispensable guide to the challenges of search of semi-structured data. - Artur Nowak, Evidence Prime
The best resource for a complex topic. Highly recommended. - Daniel Beck, juris GmbH
Took me from confused to confident in a week. - Alan McCann, Givsum.com

Streaming Analytics with IBM Streams: Analyze More, Act Faster, and Get Continuous Insights

Gain a competitive edge with IBM Streams
Turn data-in-motion into solid business opportunities with IBM Streams and let Streaming Analytics with IBM Streams show you how. This comprehensive guide starts out with a brief overview of different technologies used for big data processing and explanations of how data-in-motion can be utilized for business advantages. You will learn how to apply big data analytics and how they benefit from data-in-motion. Discover all about Streams, starting with the main components, then dive further into Streams installation, upgrade, and management capabilities, including tools used for production. Through a solid understanding of big data in motion, detailed illustrations, endnotes that provide additional learning resources, and end-of-chapter summaries with helpful insight, data analysts and professionals looking to get more from their data will benefit from expert insight on:
· Data-in-motion processing and how it can be applied to generate new business opportunities
· The three approaches to processing data in motion and pros and cons of each
· The main components of Streams from runtime to installation and administration
· Multiple purposes of the Text Analytics toolkit
· The evolving Streams ecosystem
· A detailed roadmap for programmers to quickly become fluent with Streams
Data-in-motion is rapidly becoming a business tool used to discover more about customers and opportunities; however, it is only valuable if you have the tools and knowledge to analyze and apply it. This is an expert guide to IBM Streams and how you can harness this powerful tool to gain a competitive business edge.

The Little Book of Big Decision Models

Leaders and managers want quick answers, quick ways to reach solutions, ways and means to access knowledge that won’t eat into their precious time, and quick ideas that deliver a big result. The Little Book of Big Decision Models cuts through all the noise and gives managers access to the very best decision-making models that they need to keep things moving forward. Every model is quick and easy to read and delivers the essential information and know-how quickly, efficiently and memorably.

Building Real-Time Data Pipelines

Traditional data processing infrastructures, especially those that support applications, weren’t designed for our mobile, streaming, and online world. This O’Reilly report examines how today’s distributed, in-memory database management systems (IMDBMS) enable you to make quick decisions based on real-time data. In this report, executives from MemSQL Inc. provide options for using in-memory architectures to build real-time data pipelines. If you want to instantly track user behavior on websites or mobile apps, generate reports on a changing dataset, or detect anomalous activity in your system as it occurs, you’ll learn valuable lessons from some of the largest and most successful tech companies focused on in-memory databases.
· Explore the architectural principles of modern in-memory databases
· Understand what’s involved in moving from data silos to real-time data pipelines
· Run transactions and analytics in a single database, without ETL
· Minimize complexity by architecting a multipurpose data infrastructure
· Learn guiding principles for developing an optimally architected operational system
· Provide persistence and high availability mechanisms for real-time data
· Choose an in-memory architecture flexible enough to scale across a variety of deployment options
Conor Doherty, Data Engineer at MemSQL, is responsible for creating content around database innovation, analytics, and distributed systems. Gary Orenstein, Chief Marketing Officer at MemSQL, leads marketing strategy, product management, communications, and customer engagement. Kevin White is the Director of Operations and a content contributor at MemSQL. Steven Camiña is a Principal Product Manager at MemSQL. His experience spans B2B enterprise solutions, including databases and middleware platforms.

Oracle Data Integration: Tools for Harnessing Data

Deliver continuous access to timely and accurate BI across your enterprise using the detailed information in this Oracle Press guide. Through clear explanations and practical examples, a team of Oracle experts shows how to assimilate data from disparate sources into a single, unified view. Find out how to transform data in real time, handle replication and migration, and deploy Oracle Data Integrator and Oracle GoldenGate. Oracle Data Integration: Tools for Harnessing Data offers complete coverage of the latest “big data” hardware and software solutions.
· Efficiently move data both inside and outside an Oracle environment
· Map sources to database fields using Data Merge and ETL
· Export schema through transportable tablespaces and Oracle Data Pump
· Capture and apply changes across heterogeneous systems with Oracle GoldenGate
· Seamlessly exchange information between databases using Oracle Data Integrator
· Correct errors and maximize quality through data cleansing and validation
· Plan and execute successful Oracle Database migrations and replications
· Handle high-volume transactions with Oracle Big Data Appliance, Oracle NoSQL, and third-party utilities

Cassandra Design Patterns - Second Edition

Cassandra Design Patterns is your guide to harnessing the full potential of Apache Cassandra's distributed database capabilities through advanced design practices. Whether you're migrating from an RDBMS or implementing scalable storage for big data, this book provides clear strategies, practical examples, and real-world use cases demonstrating effective design patterns.
What this Book will help me do
· Learn to integrate Cassandra with existing RDBMS solutions, enabling hybrid data architecture.
· Understand and implement key design patterns for distributed, scalable databases.
· Master the transition from RDBMS or cache systems to Cassandra with minimal disruption.
· Dive into time-series and temporal data patterns unique to Cassandra's strengths.
· Apply learned design patterns directly to real-world big data scenarios for analytics.
Author(s)
Rajanarayanan Thottuvaikkatumana, the author of Cassandra Design Patterns, is an expert in distributed systems and holds extensive experience in designing and implementing big data solutions. His hands-on approach to Cassandra is evident throughout the book as he bridges theoretical knowledge with practical applications. Rajanarayanan's approachable writing style aims to make complex concepts accessible.
Who is it for?
This book is ideal for big data developers and system architects who are familiar with the basics of Cassandra and are looking to deepen their understanding of design patterns for robust applications. Readers should have experience with relational databases and desire to migrate or integrate these concepts with NoSQL systems. Whether you're building solutions for data scalability, high availability, or analytics, Cassandra Design Patterns positions itself as an essential resource.

Access 2016 Bible

Master database creation and management
Access 2016 Bible is your comprehensive reference to the world's most popular database management tool. With clear guidance toward everything from the basics to the advanced, this go-to reference helps you take advantage of everything Access 2016 has to offer. Whether you're new to Access or getting started with Access 2016, you'll find everything you need to know to create the database solution perfectly tailored to your needs, with expert guidance every step of the way. The companion website features all examples and databases used in the book, plus trial software and a special offer from Database Creations. Start from the beginning for a complete tutorial, or dip in and grab what you need when you need it: this book gives you an expert Access 2016 companion on call 24/7. Access enables database novices and programmers to store, organize, view, analyze, and share data, as well as build powerful, integrable, custom database solutions, but databases can be complex and difficult to navigate. This book helps you harness the power of the database with a solid understanding of their purpose, construction, and application.
· Understand database objects and design systems objects
· Build forms, create tables, manipulate datasheets, and add data validation
· Use Visual Basic automation and XML Data Access Page design
· Exchange data with other Office applications, including Word, Excel, and more
From database fundamentals and terminology to XML and Web services, this book has everything you need to maximize Access 2016 and build the database you need.

Access 2016 For Dummies

Your all-access guide to all things Access 2016
If you don't know a relational database from an isolationist table, but still need to figure out how to organize and analyze your data, Access 2016 For Dummies is for you. Written in a friendly and accessible manner, it assumes no prior Access or database-building knowledge and walks you through the basics of creating tables to store your data, building forms that ease data entry, writing queries that pull real information from your data, and creating reports that back up your analysis. Add in a dash of humor and fun, and Access 2016 For Dummies is the only resource you'll need to go from data rookie to data pro! This expanded and updated edition of Access For Dummies covers all of the latest information and features to help data newcomers better understand Access' role in the world of data analysis and data science. Inside, you'll get a crash course on how databases work, and how to build one from the ground up. Plus, you'll find step-by-step guidance on how to structure data to make it useful, manipulate, edit, and import data into your database, write and execute queries to gain insight from your data, and report data in elegant ways.
· Speak the lingo of database builders and create databases that suit your needs
· Organize your data into tables and build forms that ease data entry
· Query your data to get answers right
· Create reports that tell the story of your data findings
If you have little to no experience with creating and managing a database of any sort, Access 2016 For Dummies is the perfect starting point for learning the basics of building databases, simplifying data entry and reporting, and improving your overall data skills.

Integrating the IBM MQ Appliance into your IBM MQ Infrastructure

This IBM® Redbooks® publication describes the IBM MQ Appliance M2000, an application connectivity option that combines secure, reliable IBM MQ messaging with the simplicity and low overall costs of a hardware appliance. This book presents underlying concepts and practical advice for integrating the IBM MQ Appliance M2000 into an IBM MQ infrastructure. Therefore, it is aimed at enterprises that are considering a possible first use of IBM MQ and the IBM MQ Appliance M2000 and those that already identified the appliance as a logical addition to their messaging environment. Details about new functionality and changes in approaches to application messaging are also described. The authors' goal is to help readers make informed design and implementation decisions so that the users can successfully integrate the IBM MQ Appliance M2000 into their environments. A broad understanding of enterprise messaging is required to fully comprehend the details that are provided in this book. Readers are assumed to have at least some familiarity and experience with complementary IBM messaging products.

Introducing and Implementing IBM FlashSystem V9000

Storage capacity and performance requirements are growing faster than ever before, and the costs of managing this growth are depleting more of the information technology (IT) budget. The IBM® FlashSystem™ V9000 is the premier, fully integrated, Tier 1, all-flash offering from IBM. It has changed the economics of today's data center by eliminating storage bottlenecks. Its software-defined storage features simplify data management, improve data security, and preserve your investments in storage. IBM FlashSystem® V9000 includes IBM FlashCore™ technology and advanced software-defined storage available in one solution in a compact 6U form factor. FlashSystem V9000 improves business application availability. It delivers greater resource utilization so you can get the most from your storage resources, and achieve a simpler, more scalable, and cost-efficient IT Infrastructure. This IBM Redbooks® publication provides information about IBM FlashSystem V9000 Software V7.5 and its new functionality. It describes the product architecture, software, hardware, and implementation, and provides hints and tips. It illustrates use cases and independent software vendor (ISV) scenarios that demonstrate real-world solutions, and also provides examples of the benefits gained by integrating the FlashSystem storage into business environments. Using IBM FlashSystem V9000 software version 7.5 functions, management tools, and interoperability combines the performance of FlashSystem architecture with the advanced functions of software-defined storage to deliver performance, efficiency, and functions that meet the needs of enterprise workloads that demand IBM MicroLatency® response time. This book offers FlashSystem V9000 scalability concepts and guidelines for planning, installing, and configuring, which can help environments scale up and out to add more flash capacity and expand virtualized systems. 
Port utilization methodologies are provided to help you maximize the full potential of IBM FlashSystem V9000 performance and low latency in your scalable environment. In addition, all of the functions that FlashSystem V9000 software version 7.5 brings are explained, including IBM HyperSwap® capability, increased IBM FlashCopy® bitmap space, Microsoft Windows offloaded data transfer (ODX), and direct 16 gigabits per second (Gbps) Fibre Channel host attach support. This book also describes support for VMware 6, which enhances and improves scalability in a VMware environment. This book is intended for pre-sales and post-sales technical support professionals, storage administrators, and anyone who wants to understand how to implement this exciting technology.

WHOIS Running the Internet: Protocol, Policy, and Privacy

Discusses the evolution of WHOIS and how policy changes will affect WHOIS' place in IT today and in the future
This book provides a comprehensive overview of WHOIS. The text begins with an introduction to WHOIS and an in-depth coverage of its forty-year history. Afterwards, it examines how to use WHOIS and how WHOIS fits in the overall structure of the Domain Name System (DNS). Other technical topics covered include WHOIS query code and WHOIS server details. The book also discusses current policy developments and implementations, reviews critical policy documents, and explains how they will affect the future of the Internet and WHOIS. Additional resources and content updates will be provided through a supplementary website.
· Includes an appendix with information on current and authoritative WHOIS services around the world
· Provides illustrations of actual WHOIS records and screenshots of web-based WHOIS query interfaces with instructions for navigating them
· Explains network dependencies and processes related to WHOIS utilizing flowcharts
· Contains advanced coding for programmers
WHOIS Running the Internet: Protocol, Policy, and Privacy is written primarily for internet developers, policy developers, industry professionals in law enforcement, digital forensic investigators, and intellectual property attorneys.
Garth O. Bruen is an Internet policy and security researcher whose work has been published in the Wall Street Journal and the Washington Post. Since 2012 Garth Bruen has served as the North American At-Large Chair to the Internet Corporation for Assigned Names and Numbers (ICANN). In 2003 Bruen created KnujOn.com with his late father, Dr. Robert Bruen, to process and investigate Internet abuse complaints (SPAM) from consumers. Bruen has trained and advised law enforcement at the federal and local levels on malicious use of the Domain Name System as it relates to the WHOIS record system. He has presented multiple times to the High Technology Crime Investigation Association (HTCIA) as well as other cybercrime venues including the Anti-Phishing Working Group (APWG) and the National Center for Justice and the Rule of Law at The University of Mississippi School of Law. Bruen also teaches at the Fisher College Criminal Justice School in Boston, where he develops new approaches to digital crime.

Real Time Analytics with SAP HANA

"Real Time Analytics with SAP HANA" offers a comprehensive, step-by-step guide to mastering analytics and data modeling in the powerful SAP HANA environment. This book covers everything from basic data modeling concepts to more advanced techniques like creating calculation views and leveraging SAP HANA artifacts.
What this Book will help me do
· Understand and build analytics/data models in the SAP HANA environment.
· Create schemas, packages, and delivery units in SAP HANA Studio.
· Master real-time data replication using SLT and SAP HANA Studio.
· Learn about full-text search, fuzzy search, and other analytical capabilities in SAP HANA.
· Develop comprehensive use cases combining SAP HANA concepts and tools.
Author(s)
Vinay Singh, the author of this book, is a seasoned SAP HANA expert with extensive experience in analytics and data modeling. He has worked on multiple SAP HANA implementation and migration projects and brings this expertise into his writing. His practical examples and hands-on approach make SAP HANA concepts accessible to learners at all levels.
Who is it for?
This book is ideal for SAP HANA data modelers, developers, implementation or migration consultants, project managers, and architects. It is designed for individuals aiming to enhance their skill set in SAP HANA and master real-time analytics. Whether you are actively working with SAP HANA or just starting, this book will serve as a valuable guide.

Web Development with MongoDB and NodeJS - Second Edition

Discover how to build a full-featured, interactive web application from scratch using Node.js and MongoDB in this comprehensive guide. You will learn to set up your development environment, create a web server with Express.js, and integrate MongoDB for data persistence. By the end of this book, you will have the knowledge and skills to develop and deploy robust web applications ready for the cloud.
What this Book will help me do
· Set up a Node.js development environment and connect it to MongoDB.
· Develop a web server using Express.js and write integrated APIs.
· Implement dynamic HTML pages leveraging the Handlebars template engine.
· Build efficient and scalable data-driven features using Mongoose ODM.
· Deploy web applications seamlessly to cloud platforms like Heroku, AWS, or Azure.
Author(s)
This book was co-authored by experts Satheesh, Joseph D'mello, and Jason Krol, who bring years of experience in software development and expertise in modern web technologies. With a focus on practical application and best practices, the authors aim to empower readers to succeed in real-world development projects using the innovative Node.js and MongoDB stack.
Who is it for?
This book is tailored for developers who have a basic understanding of JavaScript and HTML and wish to advance their web development skills. If you are motivated to learn how to leverage Node.js and MongoDB for full-stack development or are curious about building and deploying complete web applications, this book is ideal for you. It addresses learners from early career to experienced developers looking to strengthen their skills in modern development stacks.

Advanced Data Management

Advanced data management has always been at the core of efficient database and information systems. Recent trends like big data and cloud computing have intensified the need for sophisticated and flexible data storage and processing solutions. This book provides comprehensive coverage of the principles of data management developed in the last decades, with a focus on data structures and query languages. It treats a wealth of different data models and surveys the foundations of structuring, processing, storing and querying data according to these models. Starting off with the topic of database design, it further discusses weaknesses of the relational data model, and then proceeds to convey the basics of graph data, tree-structured XML data, key-value pairs and nested, semi-structured JSON data, columnar and record-oriented data as well as object-oriented data. The final chapters round the book off with an analysis of fragmentation, replication and consistency strategies for data management in distributed databases as well as recommendations for handling polyglot persistence in multi-model databases and multi-database architectures. While primarily geared towards students of Master-level courses in Computer Science and related areas, this book may also be of benefit to practitioners looking for a reference book on data modeling and query processing. It provides both theoretical depth and a concise treatment of open source technologies currently on the market.