talk-data.com

Topic

data-engineering

3395

tagged

Activities

3395 activities · Newest first

Building Integrated Websites with IBM Digital Experience

A digital experience is a personalized experience that provides employees, customers, business partners, and citizens with a single point of interaction with people, content, and applications anywhere, anytime, and from any device. The IBM® Digital Experience is a platform that is used to build powerful contextual websites. The strengths of the platform include the ability to mix applications and web content into a coherent user experience. Developers can build upon a prescriptive standard to build reusable building bricks, which can be used by line-of-business (LOB) users in a flexible way. LOB users can assemble pages from these building bricks and from rich web content. The page creation is performed inline by easy drag-and-drop operations without requiring sophisticated IT skills. This IBM Redbooks® publication describes how a team can build a website starting from a new installation of Digital Experience. The book provides examples of the basic tasks that are needed to get started with building a proof-of-concept (PoC) website example. The resulting example website illustrates the value and key capabilities of the Digital Experience suite, featuring IBM WebSphere® Portal and IBM Web Content Management. The target audiences for this book include the following groups: Decision makers and solution architects considering Digital Experience as a platform for their internal or external facing website. Developers who are tasked to implement a PoC and must be enabled to start quickly and efficiently, which includes the integration of existing back-end systems. A wide range of IBM services and sales professionals who are involved in selling IBM software and designing client solutions that include Digital Experience.

Big Data Now: 2015 Edition

Now in its fifth year, O’Reilly’s annual Big Data Now report recaps the trends, tools, applications, and forecasts we’ve talked about over the past year. For 2015, we’ve included a collection of blog posts, authored by leading thinkers and experts in the field, that reflect a unique set of themes we’ve identified as gaining significant attention and traction. Our list of 2015 topics includes: data-driven cultures; data science; data pipelines; big data architecture and infrastructure; the Internet of Things and real time; applications of big data; and security, ethics, and governance. Is your organization on the right track? Get a hold of this free report now and stay in tune with the latest significant developments in big data.

What Is Database Design, Anyway?

Since databases are at the center of the IT world, their proper design would seem to be paramount. And yet, some of the popular references on database design theory and design best practice show a curious lack of understanding by the IT industry at large. In this O’Reilly report, C.J. Date—a prominent researcher and consultant specializing in relational database theory—clarifies exactly what database design is, or ought to be. After providing concise definitions of physical and logical database design, Date dives deeper into the subject of logical design. Specifically, he covers concepts such as table predicate, business rule, uncontrolled redundancy, and consistency. Once you digest this report, you can find more detailed information in Date’s book Database Design and Relational Theory: Normal Forms and All That Jazz (O’Reilly, 2012). C.J. Date has a stature that is unique within the database industry. He is a prolific writer, and is well known for his best-selling textbook An Introduction to Database Systems (Addison Wesley).

MongoDB Cookbook - Second Edition

Designed to help developers and administrators harness the full potential of MongoDB, this book provides clear instruction and practical guidance no matter your level. By exploring both fundamental aspects like installation and configuration, and advanced topics like using cloud services, this book serves as a comprehensive reference for anyone navigating the modern NoSQL database capabilities of MongoDB. What this Book will help me do Understand how to install and configure MongoDB for different environments, enabling efficient setup and operation. Master database administration skills, including monitoring and backup strategies, which are essential for stability and performance. Develop applications with MongoDB using Java and Python, allowing integration into modern tech stacks. Leverage advanced querying and indexing techniques, improving data retrieval and operational efficiency. Integrate MongoDB with cloud platforms and tools like Hadoop, enhancing scalability and expanded use cases. Author(s) Dasadia and Nayak are seasoned database professionals with extensive experience in MongoDB and NoSQL database systems. Their practical approach to technical writing focuses on real-world applications and providing solutions to complex challenges. With backgrounds in software development and data management, they ensure that readers have a hands-on learning experience. Their passion for spreading knowledge makes this book both instructional and engaging. Who is it for? This book is ideal for database administrators and software developers interested in adopting or expanding their knowledge of MongoDB. If you're a complete novice or someone with experience who seeks hands-on solutions and examples, this book offers value. It's particularly suited for professionals working with Java or Python, as examples focus on these programming languages. Whether you're enhancing your skills for personal projects or looking to implement MongoDB at work, this resource equips you with the know-how.

Beginning Oracle Application Express 5

Whether you’re new to Oracle or an old hand who’s yet to test the waters of APEX, Beginning Oracle Application Express 5 introduces the processes and best practices you’ll need to become proficient with APEX. The book shows off the programming environment, the utilities and tools available, and then continues by walking through the process of building a working system from the ground up. All code is documented and explained so that those new to the languages will not be lost. After reading this book, power users and programmers alike can quickly put together robust and scalable applications for use by one person, by a department, or by an entire company. Beginning Oracle Application Express 5 introduces version 5 of the popular and productive Oracle Application Express development platform. Called APEX for short, the platform enables rapid and easy development of web-based applications that make full use of Oracle Database. The release of APEX 5 brings major changes to the page builder, an enhanced universal theme, better RESTful web services support, enhanced application packaging, and many redesigned wizards that give a new and fresh feel to the user interface. • Covers brand-new functionality in APEX 5 • Provides fully documented and explained example code • Guides you through creating a working and fully deployable application

IBM TS4500 R2 Tape Library Guide

The IBM® TS4500 tape library is a next-generation tape solution that offers higher storage density and integrated management. This IBM Redbooks® publication gives you a close-up view of the new IBM TS4500 tape library. In the TS4500, IBM delivers the density that today's and tomorrow's data growth require, with the cost-effectiveness and the manageability to grow with business data needs, while you preserve existing investments in IBM tape library products. Now, you can achieve both a low cost per terabyte (TB) and a high TB density per square foot, because the TS4500 can store up to 5.5 petabytes (PBs) of data in a single 10 square foot library frame, which is up to 3.4 times more capacity than the IBM TS3500 tape library. The TS4500 offers these benefits: Flexibility to grow: The TS4500 library can grow from both the right side and the left side of the first L frame because models can be placed in any active position. Increased capacity: The TS4500 can grow from a single L frame up to an additional 17 expansion frames with a capacity of over 23,000 cartridges. Capacity on demand (CoD): CoD is supported through entry-level, intermediate, and base-capacity configurations. Advanced Library Management System (ALMS): ALMS supports dynamic storage management, which enables users to create and change logical libraries and configure any drive for any logical library. Support for the IBM TS1150 tape drive: The TS1150 gives organizations an easy way to deliver fast access to data, improve security, and provide long-term retention, all at a lower cost than disk solutions. The TS1150 offers high-performance, flexible data storage with support for data encryption. Also, this fifth-generation drive can help protect investments in tape automation by offering compatibility with existing automation. 
Support of the IBM Linear Tape-Open (LTO) Ultrium 7 tape drive: The LTO Ultrium 7 offering represents significant improvements in capacity, performance, and reliability over the previous generation, LTO Ultrium 6, while still protecting your investment in the previous technology. This book describes TS4500 components, feature codes, specifications, supported tape drives, encryption, the new integrated management console, and the command-line interface (CLI). You will learn how to accomplish several specific tasks: Improve storage density with increased expansion frame capacity up to 2.4 times and support 33% more tape drives per frame. Manage storage by using the ALMS feature. Improve business continuity and disaster recovery with automatic control path and data path failover. Help ensure security and regulatory compliance with tape-drive encryption and Write Once Read Many (WORM) media. Support IBM LTO Ultrium 7, 6, and 5, IBM TS1150, and TS1140 tape drives. Provide a flexible upgrade path for users who want to expand their tape storage as their needs grow. Reduce the storage footprint and simplify cabling with 10 U of rack space on top of the library. This guide is for anyone who wants to understand more about the IBM TS4500 tape library. It is particularly suitable for IBM clients, IBM Business Partners, IBM specialist sales representatives, and technical specialists.

Mastering RabbitMQ

Mastering RabbitMQ provides a comprehensive guide to developing and scaling message-driven applications using RabbitMQ. From installation and configuration to advanced concepts like cluster management and high availability, this book equips you with the knowledge and practical skills to ensure robust and efficient messaging solutions. What this Book will help me do Effectively set up and configure RabbitMQ for scalable applications. Implement robust messaging solutions using RabbitMQ's administrative tools. Leverage clustering and high availability for fault-tolerant systems. Create custom RabbitMQ plugins using Erlang for extended capabilities. Monitor and secure RabbitMQ environments leveraging popular tools and practices. Author(s) Yusuf Aytas and the co-authors bring a wealth of technical knowledge and practical experience in messaging systems and RabbitMQ. With years of expertise in software engineering and hands-on implementations in enterprise environments, they provide insights that combine theory with real-world application. Who is it for? This book is ideal for developers and IT professionals with basic knowledge of RabbitMQ seeking to master advanced usage and best practices. It caters to those looking to design, develop, and maintain robust messaging systems efficiently. Perfect for those with a background in message queuing, it provides an in-depth understanding to take your expertise to the next level.

Scalable Big Data Architecture: A Practitioner’s Guide to Choosing Relevant Big Data Architecture

This book highlights the different types of data architecture and illustrates the many possibilities hidden behind the term "Big Data", from the usage of No-SQL databases to the deployment of stream analytics architecture, machine learning, and governance. Scalable Big Data Architecture covers real-world, concrete industry use cases that leverage complex distributed applications, which involve web applications, RESTful APIs, and high throughput of large amounts of data stored in highly scalable No-SQL data stores such as Couchbase and Elasticsearch. This book demonstrates how data processing can be done at scale, from the usage of NoSQL datastores to the combination of Big Data distributions. When the data processing is too complex and involves different processing topologies such as long-running jobs, stream processing, correlation of multiple data sources, and machine learning, it’s often necessary to delegate the load to Hadoop or Spark and use the No-SQL store to serve processed data in real time. This book shows you how to choose a relevant combination of big data technologies available within the Hadoop ecosystem. It focuses on processing long jobs, architecture, stream data patterns, log analysis, and real-time analytics. Every pattern is illustrated with practical examples, which use different open source projects such as Logstash, Spark, Kafka, and so on. Traditional data infrastructures are built for digesting and rendering data synthesis and analytics from large amounts of data. This book helps you to understand why you should consider using machine learning algorithms early on in the project, before being overwhelmed by constraints imposed by dealing with the high throughput of Big Data. Scalable Big Data Architecture is for developers, data architects, and data scientists looking for a better understanding of how to choose the most relevant pattern for a Big Data project and which tools to integrate into that pattern.

Asset Accounting Configuration in SAP ERP: A Step-by-Step Guide

In this book, noted expert Andrew Okungbowa explains SAP Asset Accounting (FI-AA) in SAP-ERP, including its associated business benefits, and guides you through the considerable complexities of SAP-ERP configuration. Using FI-AA for fixed asset management enables you to manage assets in multinational companies across a broad range of industries and produce reports to meet various needs in line with legal requirements. Configuring SAP-ERP can be a daunting exercise, however, and there are few resources that address these issues. Asset Accounting Configuration in SAP ERP fills that resource gap by covering the major aspects of SAP FI-AA for anyone with SAP experience and the basic accounting knowledge and bookkeeping skills necessary to apply configuration. It provides configuration explanations in the simplest forms possible and provides step-by-step guidance with illustrations and practical examples.

Beginning Neo4j

This book is your introduction to the world of graph databases and the benefits they can bring to your applications. Neo4j is the most established graph database on the market, and it's always improving to bring more of its benefits to you. Beginning Neo4j will take you from the installation of Neo4j through to building a full application with Neo4j at its heart, and everything in between. Using this book, you'll get everything up and running, and then learn how to use Neo4j to build up recommendations and relationships, and to calculate the shortest route between two locations. With example data models, and an application putting everything together, this book will give you everything you need to really get started with Neo4j. Neo4j is being used by social media and ecommerce industry giants. You can take advantage of Neo4j's powerful features and benefits - add Beginning Neo4j to your library today.
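To make the "shortest route between two locations" idea concrete, here is a minimal sketch in plain Python. It is not Neo4j or Cypher code and is not taken from the book; it only illustrates the underlying graph problem with Dijkstra's algorithm. The function name `shortest_route`, the `ROADS` network, and its distances are all hypothetical.

```python
import heapq


def shortest_route(graph, start, goal):
    """Return (total_distance, path) for the cheapest route.

    graph maps each node to a dict of {neighbour: distance}.
    Returns (inf, []) when the goal cannot be reached.
    """
    # Priority queue of (distance so far, current node, path taken).
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        dist, node, path = heapq.heappop(queue)
        if node == goal:
            return dist, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, weight in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (dist + weight, neighbour, path + [neighbour]))
    return float("inf"), []


# Hypothetical road network; the distances are made up for illustration.
ROADS = {
    "A": {"B": 4, "C": 1},
    "B": {"D": 1},
    "C": {"B": 2, "D": 5},
    "D": {},
}

if __name__ == "__main__":
    print(shortest_route(ROADS, "A", "D"))
```

A graph database would typically express the same question as a query over stored nodes and relationships rather than as application code.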

Big Data Analytics with Spark: A Practitioner’s Guide to Using Spark for Large-Scale Data Processing, Machine Learning, and Graph Analytics, and High-Velocity Data Stream Processing

This book is a step-by-step guide for learning how to use Spark for different types of big-data analytics projects, including batch, interactive, graph, and stream data analysis as well as machine learning. It covers Spark core and its add-on libraries, including Spark SQL, Spark Streaming, GraphX, MLlib, and Spark ML. Big Data Analytics with Spark shows you how to use Spark and leverage its easy-to-use features to increase your productivity. You learn to perform fast data analysis using its in-memory caching and advanced execution engine, employ in-memory computing capabilities for building high-performance machine learning and low-latency interactive analytics applications, and much more. Moreover, the book shows you how to use Spark as a single integrated platform for a variety of data processing tasks, including ETL pipelines, BI, live data stream processing, graph analytics, and machine learning. The book also includes a chapter on Scala, the hottest functional programming language, and the language that underlies Spark. You’ll learn the basics of functional programming in Scala, so that you can write Spark applications in it. What's more, Big Data Analytics with Spark provides an introduction to other big data technologies that are commonly used along with Spark, such as HDFS, Avro, Parquet, Kafka, Cassandra, HBase, Mesos, and so on. It also provides an introduction to machine learning and graph concepts. So the book is self-sufficient; all the technologies that you need to know to use Spark are covered. The only thing that you are expected to have is some programming knowledge in any language.

Next Generation Databases: NoSQL, NewSQL, and Big Data

This is a book for enterprise architects, database administrators, and developers who need to understand the latest developments in database technologies. It is the book to help you choose the correct database technology at a time when concepts such as Big Data, NoSQL and NewSQL are making what used to be an easy choice into a complex decision with significant implications. The relational database (RDBMS) model completely dominated database technology for over 20 years. Today this "one size fits all" stability has been disrupted by a relatively recent explosion of new database technologies. These paradigm-busting technologies are powering the "Big Data" and "NoSQL" revolutions, as well as forcing fundamental changes in databases across the board. Deciding to use a relational database was once truly a no-brainer, and the various commercial relational databases competed on price, performance, reliability, and ease of use rather than on fundamental architectures. Today we are faced with choices between radically different database technologies. Choosing the right database today is a complex undertaking, with serious economic and technological consequences. Next Generation Databases demystifies today’s new database technologies. The book describes what each technology was designed to solve. It shows how each technology can be used to solve real-world application and business problems. Most importantly, this book highlights the architectural differences between technologies that are the critical factors to consider when choosing a database platform for new and upcoming projects. Introduces the new technologies that have revolutionized the database landscape. Describes how each technology can be used to solve specific application or business challenges. Reviews the most popular new wave databases and how they use these new database technologies.

SAP Project Management Pitfalls: How to Avoid the Most Common Pitfalls of an SAP Solution

Master the SAP product ecosystem, the client environment, and the feasibility of implementing critical business processes with the required technical and functional configuration. SAP Project Management Pitfalls is the first book to provide you with real examples of the pitfalls that you can avoid, providing you with a road-map to a successful implementation. Jay Kay, an SAP Program Manager for Capgemini, first takes a deep dive into common pitfalls in implementing SAP ERP projects in a complex IT landscape. You will learn about the potential causes of failures, study a selection of relevant project implementation case studies in the area, and see a range of possible countermeasures. Jay Kay also provides background on each - the significance of each implementation area, its relevance to a service company that implements SAP projects, and the current state of research. Key highlights of the book: Tools and techniques for project planning and templates for allocating resources. Industry standards and innovations in SAP implementation projects in the form of standard solutions aimed at successful implementation. Managing SAP system ECC upgrades, EHP updates and project patches. Effective ways to implement robust SAP release management practices (change management, BAU). Drawing on a practitioner’s insight, Jay Kay explores the relevance of each failed implementation scenario and how to support your company or clients to succeed in an SAP implementation. There are many considerations when implementing SAP, but as you will learn, knowledge, insight, and effective tools to mitigate risks can take you to a successful implementation project.

Learning Geospatial Analysis with Python - Second Edition

Dive into the world of geospatial analysis with Python in this comprehensive guide. This book will take you through the essentials of GIS, remote sensing, elevation data, and real-time data, all using Python. You will learn to analyze and visualize geospatial data effectively, building skills and understanding that are practical and relevant. What this Book will help me do Automate geospatial workflows using Python. Create thematic maps with Python tools for better spatial insights. Understand various forms of geospatial data and how to manage them. Develop GIS applications and elevation data models using minimal lines of Python code. Utilize Python for real-time data tracking and storm modeling. Author(s) Joel Lawhead is an experienced geospatial software developer and Python programmer with extensive expertise in GIS and geospatial data analysis. With a deep understanding of Python's applications in geography, Joel brings a practical focus to his writing. His engaging style ensures that technical concepts are accessible and thoroughly explained. Who is it for? This book is ideal for Python developers, researchers, and analysts who want to enhance their GIS and geospatial analysis capabilities. If you are familiar with Python or another scripting language and have a foundational understanding of digital mapping, this book will help you advance your knowledge and skills. Whether you're analyzing spatial data or building geospatial applications, this guide is made for you.

Elasticsearch Indexing

Elasticsearch Indexing focuses on empowering developers to create optimized and user-friendly search experiences using Elasticsearch. By learning how to configure indices and mapping strategies, and leveraging analyzers effectively, you will gain proficiency in delivering fast and relevant search results tailored to modern user expectations. What this Book will help me do Understand how Elasticsearch stores data and how it reduces costs. Develop advanced mapping strategies to improve index performance. Utilize Elasticsearch analyzers for efficient search query processing. Optimize Elasticsearch clusters for scalability and stability. Perform strategic indexing to minimize resource usage while maximizing functionality. Author(s) Huseyin Akdogan is a seasoned software developer specializing in search technologies and scalability. With his deep expertise and practical insights, he brings a metric-driven approach to optimizing Elasticsearch. His book reflects his dedication to making technical concepts accessible and actionable for developers. Who is it for? This book is ideal for developers looking to gain expertise in Elasticsearch. It caters to individuals with a foundational understanding of search systems who wish to optimize their indexing and search result delivery. If you are focused on improving user search experiences tailored to scalable needs, this book is perfect for you.
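As a rough illustration of what "analyzers" and "indexing" mean in a search engine, the sketch below builds a toy inverted index in plain Python. It is not the Elasticsearch API and is not taken from the book: the `analyze`, `build_index`, and `search` names are hypothetical, and real analyzers and index structures are far more elaborate.

```python
import re
from collections import defaultdict


def analyze(text):
    """Toy analyzer: lowercase the text and split on non-alphanumeric characters."""
    return [tok for tok in re.split(r"[^a-z0-9]+", text.lower()) if tok]


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyze(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every token of the query."""
    tokens = analyze(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical documents for illustration only.
DOCS = {1: "Fast Search", 2: "fast indexing", 3: "Slow search!"}

if __name__ == "__main__":
    idx = build_index(DOCS)
    print(search(idx, "FAST"))
```

Because the same analyzer runs at index time and at query time, the query "FAST" matches documents that were written as "Fast" or "fast".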

Apache Solr: A Practical Approach to Enterprise Search

Build an enterprise search engine using Apache Solr: index and search documents; ingest data from varied sources; apply various text processing techniques; utilize different search capabilities; and customize Solr to retrieve the desired results. Apache Solr: A Practical Approach to Enterprise Search explains each essential concept--backed by practical and industry examples--to help you attain expert-level knowledge. The book, which assumes a basic knowledge of Java, starts with an introduction to Solr, followed by steps to setting it up, indexing your first set of documents, and searching them. It then introduces you to information retrieval and its implementation in Apache Solr; this will help you understand your search problem, decide the approach to build an effective solution, and use various metrics to evaluate the results. The book next covers the schema design and techniques to build a text analysis chain for cleansing, normalizing and enriching your documents and addressing different types of search queries. It describes various popular matching techniques which are generally applied to improve the precision and recall of searches. You will learn the end-to-end process of data ingestion from varied sources, metadata extraction, pre-processing and transformation of content, various search components, query parsers and other advanced search capabilities. After covering out-of-the-box features, Solr expert Dikshant Shahi dives into ways you can customize Solr for your business and its specific requirements, along with ways to plug in your own components. Most important, you will learn about implementations for Solr scoring, factors affecting the document score, and tuning the score for the application at hand. The book explains why textual scoring is not sufficient for practical ranking of documents and ways to integrate real-world factors for contributing to the document ranking. 
You'll see how to influence user experience by providing suggestions and recommendations. You'll also see integration of Solr with important related technologies such as OpenNLP and Tika. Additionally, you will learn about scaling Solr using SolrCloud. This book concludes with coverage of semantic search capabilities, which are crucial for taking the search experience to the next level. By the end of Apache Solr, you will be proficient in designing and developing your search engine.
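The blurb mentions precision and recall as measures of search quality. As a minimal sketch of the standard definitions (not Solr code, and not taken from the book), precision is the fraction of retrieved documents that are relevant, and recall is the fraction of relevant documents that were retrieved. The function name and the document ids below are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: ids the search engine returned.
    relevant:  ids a human judged relevant.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Guard against empty sets to avoid division by zero.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical result list and relevance judgments.
    print(precision_recall(["d1", "d2", "d3", "d4"], ["d1", "d3", "d5"]))
```

Tuning a search engine usually trades one measure against the other: returning more documents tends to raise recall and lower precision.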

Apache Solr for Indexing Data

Dive into the world of Apache Solr with this focused guide on indexing. Learn to harness Solr's powerful indexing features through real-world examples that help you efficiently fetch relevant data. From setting up Solr to mastering advanced techniques, this book ensures you are equipped to program and optimize Solr for a variety of applications. What this Book will help me do Gain a solid understanding of Solr indexing, analyzers, and tokenizers. Index diverse data sources, such as databases and documents, using Solr. Utilize Apache Tika to process PDFs and integrate Apache Nutch for web crawling. Implement distributed data indexing and facilitate real-time index updates. Apply learned indexing techniques to create robust solutions like e-commerce applications. Author(s) Handiekar and Johri are seasoned Apache Solr experts with years of practical experience. Having worked on diverse projects, they bring an enriched understanding of indexing and optimization. Their writing reflects a hands-on approach aimed at developers keen on mastering Solr effectively and efficiently. Who is it for? This book is designed for developers who are new to Solr yet aspire to deepen their understanding of its indexing capabilities. If you are interested in learning how to index using Solr's rich set of functionalities, this guide is for you. It provides practical steps for developers at an introductory level to start building more complex solutions. Perfect for developers eager to optimize their Solr indexing.

Learning RabbitMQ

Learning RabbitMQ offers developers and system administrators a clear and practical guide to mastering RabbitMQ, the popular message-broker solution. By going through concrete scenarios and examples, this book equips you with the skills to configure, manage, and optimize RabbitMQ instances effectively. What this Book will help me do Understand and apply common messaging patterns using RabbitMQ. Set up and manage RabbitMQ clusters with high availability. Integrate RabbitMQ with popular tools like Spring, MuleESB, and Docker. Optimize RabbitMQ performance and ensure secure messaging. Troubleshoot and extend RabbitMQ for different use cases. Author(s) Toshev, the author of Learning RabbitMQ, is an expert in middleware and distributed systems, with extensive hands-on experience working with message brokers. Toshev's practical approach and clear explanations make complex topics easy to understand, helping readers effectively apply best practices when using RabbitMQ. Who is it for? This book is for developers and system administrators who aim to incorporate RabbitMQ as part of their applications. Beginners with a basic understanding of messaging will find it foundational, while experienced users will appreciate the advanced insights on integration, performance tuning, and troubleshooting.

XQuery, 2nd Edition

The W3C XQuery 3.1 standard provides a tool to search, extract, and manipulate content, whether it's in XML, JSON or plain text. With this fully updated, in-depth tutorial, you’ll learn to program with this highly practical query language. Designed for query writers who have some knowledge of XML basics, but not necessarily advanced knowledge of XML-related technologies, this book is ideal as both a tutorial and a reference. You’ll find background information for namespaces, schemas, built-in types, and regular expressions that are relevant to writing XML queries.