talk-data.com

Topic: Data Collection (18 tagged)

Activity trend: 17 peak/qtr, 2020-Q1 to 2026-Q1

Activities

Filtering by: O'Reilly Data Engineering Books

Grokking Relational Database Design

A friendly illustrated guide to designing and implementing your first database.

Grokking Relational Database Design makes the principles of designing relational databases approachable and engaging. Everything in this book is reinforced by hands-on exercises and examples. In Grokking Relational Database Design, you’ll learn how to query and create databases using Structured Query Language (SQL); design databases from scratch; implement and optimize database designs; and take advantage of generative AI when designing databases.

A well-constructed database is easy to understand, query, manage, and scale when your app needs to grow. In Grokking Relational Database Design you’ll learn the basics of relational database design including how to name fields and tables, which data to store where, how to eliminate repetition, good practices for data collection and hygiene, and much more. You won’t need a computer science degree or in-depth knowledge of programming; the book’s practical examples and down-to-earth definitions are beginner-friendly.

About the Technology: Almost every business uses a relational database system. Whether you’re a software developer, an analyst creating reports and dashboards, or a business user just trying to pull the latest numbers, it pays to understand how a relational database operates. This friendly, easy-to-follow book guides you from square one through the basics of relational database design.

About the Book: Grokking Relational Database Design introduces the core skills you need to assemble and query tables using SQL. The clear explanations, intuitive illustrations, and hands-on projects make database theory come to life, even if you can’t tell a primary key from an inner join. As you go, you’ll design, implement, and optimize a database for an e-commerce application and explore how generative AI simplifies the mundane tasks of database design.

What's Inside: Define entities and their relationships. Minimize anomalies and redundancy. Use SQL to implement your designs. Security, scalability, and performance.

About the Reader: For self-taught programmers, software engineers, data scientists, and business data users. No previous experience with relational databases assumed.

About the Authors: Dr. Qiang Hao and Dr. Michail Tsikerdekis are both professors of Computer Science at Western Washington University.

Quotes:
"If anyone is looking to improve their database design skills, they can’t go wrong with this book." - Ben Brumm, DatabaseStar
"Goes beyond SQL syntax and explores the core principles. An invaluable resource!" - William Jamir Silva, Adjust
"Relational database design is best done right the first time. This book is a great help to achieve that!" - Maxim Volgin, KLM
"Provides necessary notions to design and build databases that can stand the data challenges we face." - Orlando Méndez, Experian

IAPP CIPP / US Certified Information Privacy Professional Study Guide, 2nd Edition

Prepare for success on the IAPP CIPP/US exam and further your career in privacy with this effective study guide - now includes a downloadable supplement to get you up to date on the current CIPP exam for 2024-2025! Information privacy has become a critical and central concern for small and large businesses across the United States. At the same time, the demand for talented professionals able to navigate the increasingly complex web of legislation and regulation regarding privacy continues to increase. Written from the ground up to prepare you for the United States version of the Certified Information Privacy Professional (CIPP) exam, Sybex's IAPP CIPP/US Certified Information Privacy Professional Study Guide also readies you for success in the rapidly growing privacy field. You'll efficiently and effectively prepare for the exam with online practice tests and flashcards as well as a digital glossary. The concise and easy-to-follow instruction contained in the IAPP/CIPP Study Guide covers every aspect of the CIPP/US exam, including the legal environment, regulatory enforcement, information management, private sector data collection, law enforcement and national security, workplace privacy and state privacy law, and international privacy regulation. 
Provides the information you need to gain a unique and sought-after certification that allows you to fully understand the privacy framework in the US. Fully updated to prepare you to advise organizations on the current legal limits of public and private sector data collection and use. Includes 1 year free access to the Sybex online learning center, with chapter review questions, full-length practice exams, hundreds of electronic flashcards, and a glossary of key terms, all supported by Wiley's support agents who are available 24x7 via email or live chat to assist with access and login questions.

Perfect for anyone considering a career in privacy or preparing to tackle the challenging IAPP CIPP exam as the next step to advance an existing privacy role, the IAPP CIPP/US Certified Information Privacy Professional Study Guide offers you an invaluable head start for success on the exam and in your career as an in-demand privacy professional.

Data Engineering and Data Science

Written and edited by one of the most prolific and well-known experts in the field and his team, this exciting new volume is the “one-stop shop” for the concepts and applications of data science and engineering for data scientists across many industries. The field of data science is incredibly broad, encompassing everything from cleaning data to deploying predictive models. However, it is rare for any single data scientist to be working across the spectrum day to day. Data scientists usually focus on a few areas and are complemented by a team of other scientists and analysts. Data engineering is also a broad field, but no individual data engineer needs to know the whole spectrum of skills. Data engineering is the aspect of data science that focuses on practical applications of data collection and analysis. For all the work that data scientists do to answer questions using large sets of information, there have to be mechanisms for collecting and validating that information. In this exciting new volume, the team of editors and contributors sketches the broad outlines of data engineering, then walks through more specific descriptions that illustrate specific data engineering roles. Data-driven discovery is revolutionizing the modeling, prediction, and control of complex systems. This book brings together machine learning, engineering mathematics, and mathematical physics to integrate modeling and control of dynamical systems with modern methods in data science. It highlights many of the recent advances in scientific computing that enable data-driven methods to be applied to a diverse range of complex systems, such as turbulence, the brain, climate, epidemiology, finance, robotics, and autonomy. Whether for the veteran engineer or scientist working in the field or laboratory, or the student or academic, this is a must-have for any library.

Unlocking the Value of Real-Time Analytics

Storing data and making it accessible for real-time analysis is a huge challenge for organizations today. In 2020 alone, 64.2 billion GB of data was created or replicated, and that volume continues to grow. With this report, data engineers, architects, and software engineers will learn how to do deep analysis and automate business decisions while keeping their analytical capabilities timely. Author Christopher Gardner takes you through current practices for extracting data for analysis and uncovers the opportunities and benefits of making that data extraction and analysis continuous. By the end of this report, you’ll know how to use new and innovative tools against your data to make real-time decisions. And you’ll understand how to examine the impact of real-time analytics on your business.

Learn the four requirements of real-time analytics: latency, freshness, throughput, and concurrency. Determine where delays between data collection and actionable analytics occur. Understand the reasons for real-time analytics and identify the tools you need to reach a faster, more dynamic level. Examine changes in data storage and software while learning methodologies for overcoming delays in existing database architecture. Explore case studies that show how companies use columnar data, sharding, and bitmap indexing to store and analyze data.

Fast and fresh data can make the difference between a successful transaction and a missed opportunity. The report shows you how.

Building an Anonymization Pipeline

How can you use data in a way that protects individual privacy but still provides useful and meaningful analytics? With this practical book, data architects and engineers will learn how to establish and integrate secure, repeatable anonymization processes into their data flows and analytics in a sustainable manner. Luk Arbuckle and Khaled El Emam from Privacy Analytics explore end-to-end solutions for anonymizing device and IoT data, based on collection models and use cases that address real business needs. These examples come from some of the most demanding data environments, such as healthcare, using approaches that have withstood the test of time.

Create anonymization solutions diverse enough to cover a spectrum of use cases. Match your solutions to the data you use, the people you share it with, and your analysis goals. Build anonymization pipelines around various data collection models to cover different business needs. Generate an anonymized version of original data or use an analytics platform to generate anonymized outputs. Examine the ethical issues around the use of anonymized data.

Data Privacy and GDPR Handbook

The definitive guide for ensuring data privacy and GDPR compliance.

Privacy regulation is increasingly rigorous around the world and has become a serious concern for senior management of companies regardless of industry, size, scope, and geographic area. The General Data Protection Regulation (GDPR) imposes complex, elaborate, and stringent requirements for any organization or individuals conducting business in the European Union (EU) and the European Economic Area (EEA), while also addressing the export of personal data outside of the EU and EEA. This recently enacted law allows the imposition of fines of up to 4% of global annual revenue for privacy and data protection violations. Despite the massive potential for steep fines and regulatory penalties, there is a distressing lack of awareness of the GDPR within the business community. A recent survey conducted in the UK suggests that only 40% of firms are even aware of the new law and their responsibilities to maintain compliance. The Data Privacy and GDPR Handbook helps organizations strictly adhere to data privacy laws in the EU, the USA, and governments around the world. This authoritative and comprehensive guide includes the history and foundation of data privacy, the framework for ensuring data privacy across major global jurisdictions, a detailed framework for complying with the GDPR, and perspectives on the future of data collection and privacy practices.

Comply with the latest data privacy regulations in the EU, EEA, US, and others. Avoid hefty fines, damage to your reputation, and losing your customers. Keep pace with the latest privacy policies, guidelines, and legislation. Understand the framework necessary to ensure data privacy today and gain insights on future privacy practices.

The Data Privacy and GDPR Handbook is an indispensable resource for Chief Data Officers, Chief Technology Officers, legal counsel, C-Level Executives, regulators and legislators, data privacy consultants, compliance officers, and audit managers.

Perspectives on Data Science for Software Engineering

Perspectives on Data Science for Software Engineering presents the best practices of seasoned data miners in software engineering. The idea for this book originated at the 2014 conference at Dagstuhl, an invitation-only gathering of leading computer scientists who meet to identify and discuss cutting-edge informatics topics. At that conference, the question of how to transfer the knowledge of seasoned software engineers and data scientists to newcomers in the field featured in many discussions. While there are many books covering data mining and software engineering basics, they present only the fundamentals and lack the perspective that comes from real-world experience. This book offers unique insights into the wisdom of the community’s leaders gathered to share hard-won lessons from the trenches. Ideas are presented in digestible chapters designed to be applicable across many domains. Topics covered include data collection, data sharing, data mining, and how to utilize these techniques in successful software projects. Newcomers to software engineering data science will learn the tips and tricks of the trade, while more experienced data scientists will benefit from war stories that show what traps to avoid.

Presents the wisdom of community experts, derived from a summit on software analytics. Provides contributed chapters that share discrete ideas and techniques from the trenches. Covers top areas of concern, including mining security and social data, data visualization, and cloud-based data. Presented in clear chapters designed to be applicable across many domains.

Manufacturing Performance Management using SAP OEE: Implementing and Configuring Overall Equipment Effectiveness

Learn how to configure, implement, enhance, and customize SAP OEE to address manufacturing performance management. Manufacturing Performance Management using SAP OEE will show you how to connect your business processes with your plant systems and how to integrate SAP OEE with ERP through standard workflows and shop floor systems for automated data collection. Manufacturing Performance Management using SAP OEE is a must-have comprehensive guide to implementing SAP OEE. It will ensure that SAP consultants and users understand how SAP OEE can offer solutions for manufacturing performance management in process industries. With this book in hand, managing shop floor execution effectively will become easier than ever. Authors Dipankar Saha and Mahalakshmi Syamsunder, both SAP manufacturing solution experts, and Sumanta Chakraborty, product owner of SAP OEE, will explain execution and processing related concepts, manual and automatic data collection through the OEE Worker UI, and how to enhance and customize interfaces and dashboards for your specific purposes. You'll learn how to capture and categorize production and loss data and use it effectively for root-cause analysis.

In addition, this book will show you: Various down-time handling scenarios. How to monitor, calculate, and define standard as well as industry-specific KPIs. How to carry out standard operational analytics for continuous improvement on the shop floor, at local plant level using MII and SAP Lumira, and also global consolidated analytics at corporation level using SAP HANA. Steps to benchmark manufacturing performance to compare similar manufacturing plants' performance, leading to a more efficient and effective shop floor.

Manufacturing Performance Management using SAP OEE will provide you with in-depth coverage of SAP OEE and how to effectively leverage its features. This will allow you to efficiently manage the manufacturing process and to enhance the shop floor's overall performance, making you the sought-after SAP OEE expert in the organization.

What You Will Learn: Configure your ERP OEE add-on to build your plant and global hierarchy and relevant master data and KPIs. Use the SAP OEE standard integration (SAP OEEINT) to integrate your ECC and OEE system to establish bi-directional integration between the enterprise and the shop floor. Enable your shop floor operator on the OEE Worker UI to handle shop floor production execution. Use SAP OEE as a tool for measuring manufacturing performance. Enhance and customize SAP OEE to suit your specific requirements. Create local plant-based reporting using SAP Lumira and MII. Use standard SAP OEE HANA analytics.

Who This Book Is For: SAP MII, ME, and OEE consultants and users who will implement and use the solution.

Microsoft Mapping: Geospatial Development in Windows 10 with Bing Maps and C#, Second Edition

This revised edition of Microsoft Mapping includes the latest details about SQL Server 2014 and the new 3D and Streetside-capable map control for Windows 10 applications. It contains updated chapters on Microsoft Azure and Power Map for Excel plus a new chapter on Bing Maps for Universal Windows. The book tells a story, from beginning to end, of planning and deploying a single geospatial application built entirely with Microsoft technologies. Readers are expected to have basic familiarity with the fundamentals of developing for Microsoft platforms (some understanding of basic SQL, C#, .NET, and WCF); as readers work through the book they will build on their existing skills so that they will be able to deploy geospatial applications for social networking, data collection, enterprise management, or other purposes.

IBM Tivoli Storage Productivity Center Beyond the Basics

You have installed and performed the basic customization of IBM® Tivoli® Storage Productivity Center. You have collected performance data and generated reports. Now it’s time to learn the best ways to use the software to manage your storage infrastructure. This IBM Redbooks® publication shows the best way to set up the software, based on your storage environment, and then how to use it to manage your infrastructure. It includes experiences from IBM clients and staff and covers the following topics:

Architectural design techniques (sizing your environment, single versus multiple installations, physical versus virtual servers, deployment in a large, existing storage infrastructure).
Database and server considerations (database backup and restoration methods and scripts, using IBM Data Studio Client for database administration, database placement and relocation, repository sizing and tuning, moving and migrating the server).
Alerting, monitoring and reporting (monitoring thresholds and alerts, performance management and analysis of reports, real-time performance monitoring for IBM SAN Volume Controller).
Security considerations (Tivoli Storage Productivity Center internal user IDs, user authentication configuration methods, how and why to set up and change passwords, configuring, querying, and testing LDAP and Microsoft Active Directory).
Health checks (server health and logs, health and recoverability of IBM DB2® databases, using the Database Maintenance tool).
Data management techniques (how to spot unusual growth incidents, scripted actions for Tivoli Storage Manager and hierarchical storage management).

This book is for storage administrators who are responsible for the performance and growth of the IT storage infrastructure.

Microsoft SQL Server 2014 Query Tuning & Optimization

Optimize Microsoft SQL Server 2014 queries and applications.

Microsoft SQL Server 2014 Query Tuning & Optimization is filled with ready-to-use techniques for creating high-performance queries and applications. The book describes the inner workings of the query processor so you can write better queries and provide the query processor with the quality information it needs to produce efficient execution plans. You’ll also get tips for troubleshooting underperforming queries. In-Memory OLTP (Hekaton), a key new feature of SQL Server 2014, is fully covered in this practical guide.

Understand how the query optimizer works. Troubleshoot queries using extended events, SQL trace, dynamic management views (DMVs), the data collector, and other tools. Work with query operators for data access, joins, aggregations, parallelism, and updates. Speed up queries and dramatically improve application performance by creating the right indexes. Understand statistics and how to detect and fix cardinality estimation errors. Maximize OLTP query performance using In-Memory OLTP (Hekaton) features, including memory-optimized tables and natively compiled stored procedures. Monitor and promote plan caching and reuse to improve application performance. Improve the performance of data warehouse queries using columnstore indexes. Handle query processor limitations with hints and other methods.

Internet and Surveillance

The Internet has been transformed in the past years from a system primarily oriented on information provision into a medium for communication and community-building. The notion of “Web 2.0”, social software, and social networking sites such as Facebook, Twitter and MySpace have emerged in this context. With such platforms comes the massive provision and storage of personal data that are systematically evaluated, marketed, and used for targeting users with advertising. In a world of global economic competition, economic crisis, and fear of terrorism after 9/11, both corporations and state institutions have a growing interest in accessing this personal data. Here, contributors explore this changing landscape by addressing topics such as commercial data collection by advertising, consumer sites and interactive media; self-disclosure in the social web; surveillance of file-sharers; privacy in the age of the internet; civil watch-surveillance on social networking sites; and networked interactive surveillance in transnational space. This book is a result of a research action launched by the intergovernmental network COST (European Cooperation in Science and Technology).

Logging and Log Management

Logging and Log Management: The Authoritative Guide to Understanding the Concepts Surrounding Logging and Log Management introduces information technology professionals to the basic concepts of logging and log management. It provides tools and techniques to analyze log data and detect malicious activity. The book consists of 22 chapters that cover the basics of log data; log data sources; log storage technologies; a case study on how syslog-ng is deployed in a real environment for log collection; covert logging; planning and preparing for the analysis of log data; simple analysis techniques; and tools and techniques for reviewing logs for potential problems. The book also discusses statistical analysis; log data mining; visualizing log data; logging laws and logging mistakes; open source and commercial toolsets for log data collection and analysis; log management procedures; and attacks against logging systems. In addition, the book addresses logging for programmers; logging and compliance with regulations and policies; planning for log analysis system deployment; cloud logging; and the future of log standards, logging, and log analysis. This book was written for anyone interested in learning more about logging and log management, including systems administrators, junior security engineers, application developers, and managers.

Comprehensive coverage of log management including analysis, visualization, reporting and more. Includes information on different uses for logs, from system operations to regulatory compliance. Features case studies on syslog-ng and actual real-world situations where logs came in handy in incident response. Provides practical guidance in the areas of reporting, log analysis system selection, planning a log analysis system, and log data normalization and correlation.

Professional SQL Server® 2008 Internals and Troubleshooting

A hands-on resource for SQL Server 2008 troubleshooting methods and tools.

SQL Server administrators need to ensure that SQL Server remains running 24/7. Authored by leading SQL Server experts and MVPs, this book provides in-depth coverage of best practices based on a deep understanding of the internals of both SQL Server and the Windows operating system. You'll get a thorough look at the SQL Server database architecture and internals as well as Windows OS internals so that you can approach troubleshooting with a solid grasp of the total processing environment. Armed with this comprehensive understanding, readers will then learn how to use a suite of tools for troubleshooting performance problems whether they originate on the database server or operating system side.

Topics Covered: SQL Server Architecture; Understanding Memory; SQL Server Waits and Extended Events; Working with Storage; CPU and Query Processing; Locking and Latches; Knowing Tempdb; Defining Your Approach To Troubleshooting; Viewing Server Performance with PerfMon and the PAL Tool; Tracing SQL Server with SQL Trace and Profiler; Consolidating Data Collection with SQLDiag and the PerfStats Script; Introducing RML Utilities for Stress Testing and Trace File Analysis; Bringing It All Together with SQL Nexus; Using Management Studio Reports and the Performance Dashboard; Using SQL Server Management Data Warehouse; Shortcuts to Efficient Data Collection and Quick Analysis.

Note: CD-ROM/DVD and other supplementary materials are not included as part of eBook file.

Mastering SQL Server® 2008

As Microsoft's bestselling database manager, SQL Server is highly flexible and customizable, and has excellent support—the 2008 version offers several significant new capabilities. This book offers accurate and expert coverage on the updates to SQL Server 2008 such as its enhanced security; the ability to encrypt an entire database, data files, and log files without the need for application changes; a scalable infrastructure that can manage reports and analysis of any size and complexity; and its extensive performance data collection.

Application and Program Performance Analysis Using PEX Statistics on IBM i5/OS

This IBM Redbooks publication is intended for use by those generally familiar with most of the iSeries IBM-provided performance tools available through the i5/OS operating system’s commands and the additional cost Performance Tools for iSeries, 5722-PT1, licensed program. i5/OS comes with a detailed program level performance data collection capability called the Performance Explorer (PEX). i5/OS commands supporting the collection include Add PEX Definition, Start Performance Explorer, and End Performance Explorer. One of the Performance Explorer (PEX) collection options is called Statistics (STATS), which collects the program level performance statistics, including CPU usage, disk I/O activity, and the occurrence of certain i5/OS and System i microcode level events. The Print PEX Report function of 5722-PT1 provides a basic view of this STATS data. PEX Statistics provides a richer interface for collection and analysis of the *STATS performance data than is available through the i5/OS PEX command and the Print PEX Report output.

Beginning InfoPath™ 2003

InfoPath creates forms for data gathering, analysis, and reporting. InfoPath has been adopted by many companies, ranging from Toyota and Hewlett-Packard to M/I Homes and New York Presbyterian Hospital, and recent laws that regulate data collection, such as Sarbanes-Oxley and HIPAA, have increased demand. Explains how to use InfoPath in a single user mode and how to use it with other databases, such as Access and SQL Server, or in conjunction with XML Web services. Shows how to deploy multi-user forms that use InfoPath with collaborative products such as Windows SharePoint Services and BizTalk.

XML in Office 2003: Information Sharing with Desktop XML

Co-authors are the world-renowned inventor of markup languages and a developer of the W3C XML Schema specification. Detailed coverage of Office 2003 Professional XML features, plus all the XML knowledge you need to use them. Learn to edit your XML document with Word, analyze its data with Excel, store it with Access, and publish it to the Web with FrontPage®. Build dynamic custom XML forms with the remarkable new InfoPath™ 2003: structured data collection with word processing flexibility.

From the Foreword by Jean Paoli, Microsoft XML Architect and co-editor of the W3C XML specification: “XML enabled the transfer of information from server to server and server to client, even in cross-platform environments. But the desktop, where documents are created and analyzed by millions of information workers, could not easily participate. Business-critical information was locked inside data storage systems or individual documents, forcing companies to adopt inefficient and duplicative business processes. “This is a book on re-inventing the way millions of people write and interact with documents. It succeeds in communicating the novel underlying vision of Office 2003 XML while focusing on task-oriented, hands-on skills for using the product.”

Desktop XML affects every Office 2003 Professional Edition user! It transforms millions of desktop computers from mere word processors into rich clients for Web services, editing front-ends for XML content management systems, and portals for XML-based application integration. And this book shows you how to benefit from it. You’ll learn exactly what XML can do for you, and you’ll master its key concepts, all in the context of the Office products you already know and use. With 200 tested and working code and markup examples and over 150 screenshots and illustrations from the actual shipped product (not betas), you’ll see step by step how: Office users can share documents more easily, without error-prone rework, re-keying, or cut-and-paste.
Office data from your documents can be captured for enterprise databases. Office documents can be kept up-to-date with live data from Web Services and enterprise data stores. Office solutions can overcome traditional limitations by using XML and Smart Documents. BONUS XML SKILLS SECTION! All the XML expertise you’ll need, adapted for Office 2003 users from the best-selling Charles F. Goldfarb’s XML Handbook, Fifth Edition: the XML language, XML Schema, XPath, XSLT, Web services … and more! CD-ROM INCLUDED: Provides a fully functional 60-day trial version of Microsoft InfoPath 2003.