talk-data.com

Topic

data-engineering

3377

tagged

Activities

Filtering by: O'Reilly Data Engineering Books

Beginning SQL Server for Developers, Fourth Edition

Beginning SQL Server for Developers is the perfect book for developers new to SQL Server and planning to create and deploy applications against Microsoft’s market-leading database system for the Windows platform. Now in its fourth edition, the book is enhanced to cover the very latest developments in SQL Server, including the in-memory features that are introduced in SQL Server 2014. Within the book, there are plenty of examples of tasks that developers routinely perform. You’ll learn to create tables and indexes, and be introduced to best practices for securing your valuable data. You’ll learn design tradeoffs and find out how to make sound decisions resulting in scalable databases and maintainable code. SQL Server 2014 introduces in-memory tables and stored procedures. It's now possible to accelerate applications by creating tables (and their indexes) that reside entirely in memory, and never on disk. These new, in-memory structures differ from caching mechanisms of the past, and make possible the extraordinarily swift execution of certain types of queries such as are used in business intelligence applications. Beginning SQL Server for Developers helps you realize the promises of this new feature set while avoiding pitfalls that can occur when mixing in-memory tables and code with traditional, disk-based tables and code. Beginning SQL Server for Developers takes you through the entire database development process, from installing the software to creating a database to writing the code to connect to that database and move data in and out. By the end of the book, you’ll be able to design and create solid and reliable database solutions using SQL Server.
• Takes you through the entire database application development lifecycle
• Includes brand new coverage of the in-memory features
• Introduces the freely-available Express Edition

Enhanced Networking on IBM z/VSE

The importance of modern computer networks is steadily growing as increasing amounts of data are exchanged over company intranets and the Internet. Understanding current networking technologies and communication protocols that are available for the IBM® mainframe and System z® operating systems is essential for setting up your network infrastructure with IBM z/VSE®. This IBM Redbooks® publication helps you install, tailor, and configure new networking options for z/VSE that are available with TCP/IP for VSE/ESA, IPv6/VSE, and Fast Path to Linux on System z (Linux Fast Path). We put a strong focus on network security and describe how the new OpenSSL-based SSL runtime component can be used to enhance the security of your business. This IBM Redbooks publication extends the information that is provided in Security on IBM z/VSE, SG24-7691.

Big Data Made Easy: A Working Guide to the Complete Hadoop Toolset

Many corporations are finding that the size of their data sets is outgrowing the capability of their systems to store and process them. The data is becoming too big to manage and use with traditional tools. The solution: implementing a big data system. As Big Data Made Easy: A Working Guide to the Complete Hadoop Toolset shows, Apache Hadoop offers a scalable, fault-tolerant system for storing and processing data in parallel. It has a very rich toolset that allows for storage (Hadoop), configuration (YARN and ZooKeeper), collection (Nutch and Solr), processing (Storm, Pig, and MapReduce), scheduling (Oozie), moving (Sqoop and Avro), monitoring (Chukwa, Ambari, and Hue), testing (Bigtop), and analysis (Hive). The problem is that the Internet offers IT pros wading into big data many versions of the truth and some outright falsehoods born of ignorance. What is needed is a book just like this one: a wide-ranging but easily understood set of instructions to explain where to get Hadoop tools, what they can do, how to install them, how to configure them, how to integrate them, and how to use them successfully. And you need an expert who has worked in this area for a decade, someone just like author and big data expert Mike Frampton. Big Data Made Easy approaches the problem of managing massive data sets from a systems perspective, and it explains the roles for each project (like architect and tester, for example) and shows how the Hadoop toolset can be used at each system stage. It explains, in an easily understood manner and through numerous examples, how to use each tool. The book also explains the sliding scale of tools available depending upon data size and when and how to use them.
Big Data Made Easy shows developers and architects, as well as testers and project managers, how to:
• Store big data
• Configure big data
• Process big data
• Schedule processes
• Move data among SQL and NoSQL systems
• Monitor data
• Perform big data analytics
• Report on big data processes and projects
• Test big data systems
Big Data Made Easy also explains the best part, which is that this toolset is free. Anyone can download it and, with the help of this book, start to use it within a day. With the skills this book will teach you under your belt, you will add value to your company or client immediately, not to mention your career.

Cassandra High Availability

This book, "Cassandra High Availability", equips you with the knowledge and practical skills to harness Apache Cassandra's capabilities for building resilient, scalable, and highly-available systems. Suitable for developers or DevOps engineers with foundational knowledge of Cassandra, this resource takes you deeper into advanced topics necessary for maintaining robust distributed systems.
What this Book will help me do
• Understand and utilize Cassandra's replication protocols and consistency levels to balance performance and reliability.
• Configure and manage multi-data-center setups in Cassandra for failover and geographic redundancy.
• Implement techniques to efficiently scale your Cassandra cluster with no downtime.
• Learn how to design high-availability data models optimized for performance and resilience.
• Identify and avoid common anti-patterns in Cassandra to maintain system efficiency and reliability.
Author(s)
Strickland, the author of "Cassandra High Availability", is an experienced data engineer with a deep understanding of distributed systems and database technologies. Strickland has worked extensively with Apache Cassandra in designing and optimizing scalable infrastructures. They bring a hands-on and detailed approach to explaining complex topics, making them accessible to both developers and system operators.
Who is it for?
This book is tailored for developers and DevOps engineers who have foundational knowledge of Apache Cassandra and are aiming to deepen their expertise. If your goal is to design, manage, and optimize high-availability distributed systems, this book provides practical strategies and technical insights for mastering Cassandra's capabilities. Ideal for those seeking to build fault-tolerant, scalable infrastructures.

Mastering Hadoop

Embark on a journey to master Hadoop and its advanced features with this comprehensive book. "Mastering Hadoop" equips you with the knowledge needed to tackle complex data processing challenges and optimize your Hadoop workflows. With clear explanations and practical examples, this book is your guide to becoming proficient in leveraging Hadoop technologies.
What this Book will help me do
• Optimize Hadoop MapReduce jobs, Pig scripts, and Hive queries for better performance.
• Understand and employ advanced data formats and Hadoop I/O techniques.
• Learn to integrate low-latency processing with Storm on YARN.
• Explore the cloud deployment of Hadoop and advanced HDFS alternatives.
• Enhance Hadoop security and master techniques for analytics using Hadoop.
Author(s)
Karanth is an experienced Hadoop professional with years of expertise in data processing and distributed computing. With a practical and methodical approach, Karanth has crafted this book to empower learners with the essentials and advanced features of Hadoop. Karanth's focus on performance optimization and real-world applications helps bridge the gap between theory and practice.
Who is it for?
This book is ideal for data engineers and software developers familiar with the basics of Hadoop who seek to advance their understanding. If you aim to enhance Hadoop performance or adopt new features like YARN and Storm, this book is for you. Readers interested in Hadoop deployment, optimization, and newer capabilities will also greatly benefit. It's perfect for anyone aiming to become a Hadoop expert, from intermediate learners to advanced practitioners.

IBM i 7.1 Technical Overview with Technology Refresh Updates

This IBM® Redbooks® publication provides a technical overview of the features, functions, and enhancements available in IBM i 7.1, including all the Technology Refresh (TR) levels from TR1 to TR7. It provides a summary and brief explanation of the many capabilities and functions in the operating system. It also describes many of the licensed programs and application development tools that are associated with IBM i. The information provided in this book is useful for clients, IBM Business Partners, and IBM service professionals who are involved with planning, supporting, upgrading, and implementing IBM i 7.1 solutions.

Database Systems

Database Systems: A Pragmatic Approach provides a comprehensive, yet concise introduction to database systems. It discusses the database as an essential component of a software system, as well as a valuable, mission critical corporate resource. The book is based on lecture notes that have been tested and proven over several years, with outstanding results. It also exemplifies mastery of the technique of combining and balancing theory with practice, to give students their best chance at success. Upholding his aim for brevity, comprehensive coverage, and relevance, author Elvis C. Foster's practical and methodical discussion style gets straight to the salient issues, and avoids unnecessary fluff as well as an overkill of theoretical calculations. The book discusses concepts, principles, design, implementation, and management issues of databases. Each chapter is organized systematically into brief, reader-friendly sections, with itemization of the important points to be remembered. It adopts a methodical and pragmatic approach to solving database systems problems. Diagrams and illustrations also sum up the salient points to enhance learning. Additionally, the book includes a number of Foster's original methodologies that add clarity and creativity to the database modeling and design experience while making a novel contribution to the discipline. Everything combines to make Database Systems: A Pragmatic Approach an excellent textbook for students, and an excellent resource on theory for the practitioner.

PostgreSQL: Up and Running, 2nd Edition

Thinking of migrating to PostgreSQL? This clear, fast-paced introduction helps you understand and use this open source database system. Not only will you learn about the enterprise-class features in versions 9.2, 9.3, and 9.4, you’ll also discover that PostgreSQL is more than a database system: it’s also an impressive application platform.

SQL Server Integration Services Design Patterns, Second Edition

SQL Server Integration Services Design Patterns is newly revised for SQL Server 2014, and is a book of recipes for SQL Server Integration Services (SSIS). Design patterns in the book help to solve common problems encountered when developing data integration solutions. The patterns and solution examples in the book increase your efficiency as an SSIS developer, because you do not have to design and code from scratch with each new problem you face. The book's team of expert authors takes you through numerous design patterns that you'll soon be using every day, providing the thought process and technical details needed to support their solutions. SQL Server Integration Services Design Patterns goes beyond the surface of the immediate problems to be solved, delving into why particular problems should be solved in certain ways. You'll learn more about SSIS as a result, and you'll learn by practical example. Where appropriate, the book provides examples of alternative patterns and discusses when and where they should be used. Highlights of the book include sections on ETL Instrumentation, SSIS Frameworks, Business Intelligence Markup Language, and Dependency Services.
• Takes you through solutions to common data integration challenges
• Provides examples involving Business Intelligence Markup Language
• Teaches SSIS using practical examples

Sharing Data and Models in Software Engineering

Data Science for Software Engineering: Sharing Data and Models presents guidance and procedures for reusing data and models between projects to produce results that are useful and relevant. Starting with a background section of practical lessons and warnings for beginner data scientists in software engineering, this edited volume proceeds to identify critical questions of contemporary software engineering related to data and models. Learn how to adapt data from other organizations to local problems, mine privatized data, prune spurious information, simplify complex results, update models for new platforms, and more. Chapters share widely applicable experimental results, discussed with a blend of practitioner-focused domain expertise and commentary that highlights the methods that are most useful and applicable to the widest range of projects. Each chapter is written by a prominent expert and offers a state-of-the-art solution to an identified problem facing data scientists in software engineering. Throughout, the editors share best practices collected from their experience training software engineering students and practitioners to master data science.
• Shares the specific experience of leading researchers and techniques developed to handle data problems in the realm of software engineering
• Explains how to start a project of data science for software engineering as well as how to identify and avoid likely pitfalls
• Provides a wide range of useful qualitative and quantitative principles ranging from very simple to cutting edge research
• Addresses current challenges with software engineering data such as lack of local data, access issues due to data privacy, and improving data quality by cleaning spurious chunks of data

A Software Architect's Guide to New Java Workloads in IBM CICS Transaction Server

This IBM® Redpaper Redbooks® publication introduces the IBM System z® New Application License Charges (zNALC) pricing structure and provides examples of zNALC workload scenarios. It describes the products that can be run on a zNALC logical partition (LPAR) and reasons to consider such an implementation, and covers the following topics:
• Using the IBM WebSphere Application Server Liberty profile to host applications within an IBM CICS® environment and how it interacts with CICS applications and resources
• Security technologies available to applications that are hosted within a WebSphere Application Server Liberty profile in CICS
• How to implement modern presentation in CICS with a CICS Liberty Java virtual machine (JVM) server
• How to share scenarios to develop Liberty JVM applications to gain benefits from IBM CICS Transaction Server for IBM z/OS® Value Unit Edition
• Considerations when using mobile devices to interact with CICS applications, and the specific CICS technologies for connecting mobile devices by using the z/OS Value Unit Edition
• How IBM Operational Decision Manager for z/OS runs in the transaction server to provide decision management services for CICS COBOL and PL/I applications
• Installing the CICS Transaction Server for z/OS (CICS TS) Feature Pack for Modern Batch to enable the IBM WebSphere® batch environment to schedule and manage batch applications in CICS
This book also covers what is commonly referred to as plain old Java objects (POJOs). The Java virtual machine (JVM) server is a full-fledged JVM that includes support for Open Service Gateway initiative (OSGi) bundles. It can be used to host open source Java frameworks and does just about anything you want to do with Java on the mainframe. POJO applications can also qualify for deployment using the Value Unit Edition. Read about how to configure and deploy them in this companion Redbooks publication: IBM CICS and the JVM server: Developing and Deploying Java Applications, SG24-8038. Examples of POJOs are terminal-initiated transactions, CICS web support, web services, requests received via IP CICS sockets, and messages coming in via IBM WebSphere MQ messaging software.

Beginning Apache Cassandra Development

Beginning Apache Cassandra Development introduces you to one of the most robust and best-performing NoSQL database platforms on the planet. Apache Cassandra is a distributed wide-column store, not a JSON document database. It is specifically designed to manage large amounts of data across many commodity servers without there being any single point of failure. This design approach makes Apache Cassandra a robust and easy-to-implement platform when high availability is needed. Apache Cassandra can be used by developers in Java, PHP, Python, and JavaScript, the primary and most commonly used languages. In Beginning Apache Cassandra Development, author and Cassandra expert Vivek Mishra takes you through using Apache Cassandra from each of these primary languages. Mishra also covers the Cassandra Query Language (CQL), the Apache Cassandra analog to SQL. You'll learn to develop applications sourcing data from Cassandra, query that data, and deliver it at speed to your application's users. Cassandra is one of the leading NoSQL databases, meaning you get unparalleled throughput and performance without the sort of processing overhead that comes with traditional proprietary databases. Beginning Apache Cassandra Development will therefore help you create applications that generate search results quickly, stand up to high levels of demand, scale as your user base grows, ensure operational simplicity, and, not least, provide delightful user experiences.

Practical Hadoop Security

Practical Hadoop Security is an excellent resource for administrators planning a production Hadoop deployment who want to secure their Hadoop clusters. In this detailed guide to the security options and configuration within Hadoop itself, author Bhushan Lakhe takes you through a comprehensive, hands-on study of how to implement defined security within a Hadoop cluster. You will start with a detailed overview of all the security options available for Hadoop, including popular extensions like Kerberos and OpenSSH, and then delve into a hands-on implementation of user security (with illustrated code samples) with both in-the-box features and security extensions implemented by leading vendors. No security system is complete without a monitoring and tracing facility, so Practical Hadoop Security next steps you through audit logging and monitoring technologies for Hadoop, as well as ready-to-use implementation and configuration examples, again with illustrated code samples. The book concludes with the most important aspect of Hadoop security: encryption. Both types of encryption, for data in transit and data at rest, are discussed at length with leading open source projects that integrate directly with Hadoop at no licensing cost. Practical Hadoop Security:
• Explains the importance of security, auditing, and encryption within a Hadoop installation
• Describes how the leading players have incorporated these features within their Hadoop distributions and provided extensions
• Demonstrates how to set up and use these features to your benefit and make your Hadoop installation secure without impacting performance or ease of use

eXist

Get a head start with eXist, the open source NoSQL database and application development platform built entirely around XML technologies. With this hands-on guide, you’ll learn eXist from the ground up, from using this feature-rich database to work with millions of documents to building complex web applications that take advantage of eXist’s many extensions. If you’re familiar with XML—as a student, professor, publisher, or developer—you’ll find that eXist is ideal for all kinds of documents.

Big Data Now: 2014 Edition

In the four years that O'Reilly Media, Inc. has produced its annual Big Data Now report, the data field has grown from infancy into young adulthood. Data is now a leader in some fields and a driver of innovation in others, and companies that use data and analytics to drive decision-making are outperforming their peers. And while access to big data tools and techniques once required significant expertise, today many tools have improved and communities have formed to share best practices. Companies have also started to emphasize the importance of processes, culture, and people. The topics in Big Data Now: 2014 Edition represent the major forces currently shaping the data world:
• Cognitive augmentation: predictive APIs, graph analytics, and Network Science dashboards
• Intelligence matters: defining AI, modeling intelligence, deep learning, and "summoning the demon"
• Cheap sensors, fast networks, and distributed computing: stream processing, hardware data flows, and computing at the edge
• Data (science) pipelines: broadening the coverage of analytic pipelines with specialized tools
• Evolving marketplace of big data components: SSDs, Hadoop 2, Spark; and why datacenters need operating systems
• Design and social science: human-centered design, wearables and real-time communications, and wearable etiquette
• Building a data culture: moving from prediction to real-time adaptation; and why you need to become a data skeptic
• Perils of big data: data redlining, intrusive data analysis, and the state of big data ethics

Augmented Reality Law, Privacy, and Ethics

Augmented Reality (AR) is the blending of digital information in a real-world environment. A common example can be seen during any televised football game, in which information about the game is digitally overlaid on the field as the players move and position themselves. Another application is Google Glass, which enables users to see AR graphics and information about their location and surroundings on the lenses of their "digital eyewear", changing in real-time as they move about. Augmented Reality Law, Privacy, and Ethics is the first book to examine the social, legal, and ethical issues surrounding AR technology. Digital eyewear products have very recently thrust this rapidly-expanding field into the mainstream, but the technology is so much more than those devices. Industry analysts have dubbed AR the "eighth mass medium" of communications. Science fiction movies have shown us the promise of this technology for decades, and now our capabilities are finally catching up to that vision. Augmented Reality will influence society as fundamentally as the Internet itself has done, and such a powerful medium cannot help but radically affect the laws and norms that govern society. No author is as uniquely qualified to provide a big-picture forecast and guidebook for these developments as Brian Wassom. A practicing attorney, he has been writing on AR law since 2007 and has established himself as the world's foremost thought leader on the intersection of law, ethics, privacy, and AR. Augmented Reality professionals around the world follow his Augmented Legality® blog. This book collects and expands upon the best ideas expressed in that blog, and sets them in the context of a big-picture forecast of how AR is shaping all aspects of society. Augmented reality thought-leader Brian Wassom provides you with insight into how AR is changing our world socially, ethically, and legally. Includes current examples, case studies, and legal cases from the frontiers of AR technology. 
Learn how AR is changing our world in the areas of civil rights, privacy, litigation, courtroom procedure, addiction, pornography, criminal activity, patent, copyright, and free speech. An invaluable reference guide to the impacts of this cutting-edge technology for anyone who is developing apps for it, using it, or affected by it in daily life.

Cloud Enabling IBM CICS

This IBM® Redbooks® publication takes an existing IBM 3270-COBOL-VSAM application and describes how to use the features of IBM Customer Information Control System (CICS®) Transaction Server (CICS TS) cloud enablement. Working with the General Insurance Application (GENAPP) as an example, this book describes the steps needed to monitor both platform and application health using the CICS Explorer CICS Cloud perspective. It also shows you how to apply threshold policy and measure resource usage, all without source code changes to the original application. In addition, this book describes how to use multi-versioning to safely and reliably apply and back out application changes. This Redbooks publication includes instructions about the following topics:
• How to create a CICS TS platform to manage and reflect the health of a set of CICS TS regions, and the services that they provide to applications
• How to quickly get value from CICS TS applications, by creating and deploying a CICS TS application for an existing user application
• How to protect your CICS TS platform from erroneous applications by using threshold policies
• How to deploy and run multiple versions of the same CICS TS application on the same CICS TS platform at the same time, enabling a safer migration from one application version to another, with no downtime
• How to measure application resource usage, enabling a comparison of the performance of different application versions, and chargeback based on application use

FileMaker® Pro 13 Absolute Beginner’s Guide

Make the most of FileMaker Pro 13 without becoming a technical expert! This book is the fastest way to create FileMaker Pro databases that perform well, are easy to manage, solve problems, and achieve your goals! Even if you’ve never used FileMaker Pro before, you’ll learn how to do what you want, one incredibly clear and easy step at a time. FileMaker Pro has never, ever been this simple! Who knew how simple FileMaker® Pro 13 could be? This is the easiest, most practical beginner’s guide to using the powerful new FileMaker Pro 13 database program, with simple, reliable instructions for doing everything you really want to do! Here’s a small sample of what you’ll learn:
• Get comfortable with the FileMaker Pro environment, and discover all you can do with it
• Create complete databases instantly with Starter Solutions
• Design custom databases that efficiently meet your specific needs
• Identify the right tables, fields, and relationships; create new databases from scratch
• Expand your database to integrate new data and tables
• Craft layouts that make your database easier and more efficient to use
• Quickly find, sort, organize, import, and export data
• Create intuitive, visual reports and graphs for better decision-making
• Use scripts to automate a wide variety of routine tasks
• Safeguard databases with accounts, privileges, and reliable backups
• Share data with colleagues running iPads, iPhones, Windows computers, or Macs
• Take your data with you through FileMaker Go
• Master expert tips and hidden features you’d never find on your own
• And much more…

Google Earth Forensics

Google Earth Forensics is the first book to explain how to use Google Earth in digital forensic investigations. This book teaches you how to leverage Google's free tool to craft compelling location-based evidence for use in investigations and in the courtroom. It shows how to extract location-based data and present it as audiovisual evidence that explains the data in contextual, meaningful, and easy-to-understand ways. As mobile computing devices become more prevalent and powerful, they are becoming more useful in law enforcement investigations and forensics. Of all the widely used mobile applications, none have more potential for helping solve crimes than those with geo-location tools. Aimed at investigators and forensic practitioners, Google Earth Forensics is written by an investigator and trainer with more than 13 years of experience in law enforcement who will show you how to use this valuable tool anywhere: at the crime scene, in the lab, or in the courtroom.
• Learn how to extract location-based evidence using the Google Earth program or app on computers and mobile devices
• Covers the basics of GPS systems and the usage of Google Earth, and helps sort through data imported from external evidence sources
• Includes tips on presenting evidence in compelling, easy-to-understand formats