talk-data.com

Topic: data (5765 tagged)

Activity Trend: peak of 3 per quarter, 2020-Q1 to 2026-Q1

Activities

5765 activities · Newest first

Using Flume

How can you get your data from frontend servers to Hadoop in near real time? With this complete reference guide, you’ll learn Flume’s rich set of features for collecting, aggregating, and writing large amounts of streaming data to the Hadoop Distributed File System (HDFS), Apache HBase, SolrCloud, Elasticsearch, and other systems. Using Flume shows operations engineers how to configure, deploy, and monitor a Flume cluster, and teaches developers how to write Flume plugins and custom components for their specific use cases. You’ll learn about Flume’s design and implementation, as well as the features that make it highly scalable, flexible, and reliable. Code examples and exercises are available on GitHub.

- Learn how Flume provides a steady rate of flow by acting as a buffer between data producers and consumers
- Dive into key Flume components, including sources that accept data and sinks that write and deliver it
- Write custom plugins to customize the way Flume receives, modifies, formats, and writes data
- Explore APIs for sending data to Flume agents from your own applications
- Plan and deploy Flume in a scalable and flexible way, and monitor your cluster once it’s running
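The buffering idea mentioned above can be illustrated with a minimal sketch. This is not Flume's API: the function name `run_pipeline`, the `capacity` parameter, and the use of a bounded in-memory queue as a stand-in for a channel are all assumptions made for illustration.

```python
import queue
import threading


def run_pipeline(events, capacity=4):
    """Move events from a producer to a consumer through a bounded buffer.

    The bounded queue is a hypothetical stand-in for a channel: when it is
    full, the producer blocks, which smooths bursts before they reach the
    consumer.
    """
    channel = queue.Queue(maxsize=capacity)
    delivered = []
    done = object()  # sentinel that marks the end of the stream

    def producer():
        for event in events:
            channel.put(event)  # blocks while the buffer is full
        channel.put(done)

    def consumer():
        while True:
            event = channel.get()
            if event is done:
                break
            delivered.append(event)  # a real sink would write to storage

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return delivered
```

With one producer and one consumer, the queue preserves arrival order, so `run_pipeline(list(range(20)))` returns the events in the order they were produced.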

HL7 for BizTalk

Vikas Bhardwaj is a Technical Architect at Syntel Inc. Vikas has 14 years of IT experience with Microsoft technologies such as BizTalk Server, .NET, C#, and SQL Server. Vikas has implemented various integration solutions using BizTalk Server, including one of the largest implementations of BizTalk and HL7. Vikas presently lives in Nashville, Tennessee with his wife Poonam and two kids, Shivam and Ayaan. You can check out Vikas' blog at http://vikasbhardwaj15.blogspot.com/ and Vikas can be contacted directly at [email protected].

HL7 for BizTalk provides a detailed guide to the planning and delivery of an HL7-compliant system using the dedicated Microsoft BizTalk for HL7 Accelerator. The HL7 Primary Standard, its various versions, and the use of the HL7 Accelerator for BizTalk are broken out and fully explained. HL7 for BizTalk provides clear guidance on the specific healthcare scenarios that HL7 is designed to address and provides working case study models of how HL7 solutions can be implemented in BizTalk, deployed in practice, and monitored during operation. Special emphasis is given in this book to the BizTalk reporting functionality and its use to provide HL7 oversight within organizations. HL7 for BizTalk is suitable for use with BizTalk versions from 2006 R2 to 2013 R2. All three versions of the HL7 standard, and their differences, are explained.

Howard S. Edidin is an integrations architect specializing in enterprise application integration. Howard runs his own consulting firm, Edidin Group, Inc., which is a Gold Member of the HL7 International Organization. Howard's firm specializes in delivering HL7 and HIPAA healthcare solutions and providing guidance in the use of HL7 with BizTalk. Howard is active in several HL7 Working Groups and is involved with the development of a new HL7 Standard. In addition to BizTalk, Howard works with Azure, SQL Server, and SharePoint. Howard and his wife, Sharon, live in a northern suburb of Chicago. Howard maintains several blogs, including biztalkin-howard.blogspot.com and fhir-biztalk.com. Howard can be contacted directly at [email protected].

IBM z/OS V2.1 DFSMS Technical Update

Each release of IBM® z/OS® DFSMS builds upon the previous version to provide enhanced storage management, data access, device support, program management, and distributed data access for the z/OS platform in a system-managed storage environment. This IBM Redbooks® publication provides a summary of the functions and enhancements integrated into z/OS V2.1 DFSMS. It provides you with the information that you need to understand and evaluate the content of this DFSMS release, along with practical implementation hints and tips. This book is written for storage professionals and system programmers who have experience with the components of DFSMS. It provides sufficient information so that you can start prioritizing the implementation of new functions and evaluating their applicability in your DFSMS environment.

MATLAB Control Systems Engineering

MATLAB is a high-level language and environment for numerical computation, visualization, and programming. Using MATLAB, you can analyze data, develop algorithms, and create models and applications. The language, tools, and built-in math functions enable you to explore multiple approaches and reach a solution faster than with spreadsheets or traditional programming languages, such as C/C++ or Java. MATLAB Control Systems Engineering introduces you to the MATLAB language with practical hands-on instructions and results, allowing you to quickly achieve your goals. In addition to giving an introduction to the MATLAB environment and MATLAB programming, this book provides all the material needed to design and analyze control systems using MATLAB’s specialized Control System Toolbox. The Control System Toolbox offers an extensive range of tools for classical and modern control design. Using these tools, you can create models of linear time-invariant systems in transfer function, zero-pole-gain, or state-space format. You can manipulate both discrete-time and continuous-time systems and convert between the various representations. You can calculate and graph time responses, frequency responses, and root loci. Other functions allow you to perform pole placement, optimal control, and estimation. The Control System Toolbox is open and extensible, allowing you to create customized M-files to suit your specific applications.
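To make the idea of converting between representations concrete, here is a minimal sketch in plain Python (not MATLAB, and not the toolbox's own code) that turns a strictly proper transfer function into one state-space realization, the controllable canonical form. The function name `tf_to_state_space` is hypothetical, and the state ordering is one common convention; a toolbox may order the states differently while describing the same system.

```python
def tf_to_state_space(num, den):
    """Return (A, B, C, D) for G(s) = num(s) / den(s).

    num and den are coefficient lists, highest power of s first.
    Only strictly proper systems (deg num < deg den) are handled.
    """
    n = len(den) - 1  # system order
    if n < 1 or len(num) > n:
        raise ValueError("transfer function must be strictly proper")

    lead = den[0]
    a = [c / lead for c in den[1:]]                       # a_{n-1} ... a_0
    b = [0.0] * (n - len(num)) + [c / lead for c in num]  # b_{n-1} ... b_0

    # Companion matrix: ones on the superdiagonal, -a_0 ... -a_{n-1} in the last row.
    A = [[1.0 if j == i + 1 else 0.0 for j in range(n)] for i in range(n - 1)]
    A.append([-c for c in reversed(a)])
    B = [[0.0] for _ in range(n - 1)] + [[1.0]]
    C = [list(reversed(b))]                               # b_0 ... b_{n-1}
    D = [[0.0]]
    return A, B, C, D
```

For example, `tf_to_state_space([1], [1, 3, 2])` describes G(s) = 1 / (s^2 + 3s + 2) with states x1 = z and x2 = dz/dt, so the last row of A holds the negated denominator coefficients.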

Statistical Inference for Models with Multivariate t-Distributed Errors

This book summarizes the results of various models under normal theory with a brief review of the literature. Statistical Inference for Models with Multivariate t-Distributed Errors:

- Includes a wide array of applications for the analysis of multivariate observations
- Emphasizes the development of linear statistical models with applications to engineering, the physical sciences, and mathematics
- Contains an up-to-date bibliography featuring the latest trends and advances in the field to provide a collective source for research on the topic
- Addresses linear regression models with non-normal errors with practical real-world examples
- Uniquely addresses regression models with Student's t-distributed errors and t-models
- Is supplemented with an Instructor's Solutions Manual, which is available via written request to the Publisher

IBM DS8870 Architecture and Implementation

This IBM® Redbooks® publication describes the concepts, architecture, and implementation of the IBM DS8870. The book provides reference information to assist readers who need to plan for, install, and configure the DS8870. The IBM DS8870 is the most advanced model in the IBM DS8000® series and is equipped with IBM POWER7+™ based controllers. Various configuration options are available that scale from dual 2-core systems up to dual 16-core systems with up to 1 TB of cache. The DS8870 features an integrated high-performance flash enclosure with flash cards that can deliver up to 250,000 IOPS and up to 3.4 GBps bandwidth. A high-performance all-flash drive configuration is also available. The DS8870 also features enhanced 8 Gbps device adapters and host adapters. Connectivity options, with up to 128 Fibre Channel/IBM FICON® ports for host connections, make the DS8870 suitable for multiple server environments in open systems and IBM System z® environments. The DS8870 supports advanced disaster recovery solutions, business continuity solutions, and thin provisioning. All disk drives in the DS8870 storage system have the Full Disk Encryption (FDE) feature. The DS8870 can also be integrated in a Lightweight Directory Access Protocol (LDAP) infrastructure. The DS8870 can automatically optimize the use of each storage tier, particularly flash drives and flash cards, through the IBM Easy Tier® feature, which is available at no extra charge.

Bayesian Networks for Probabilistic Inference and Decision Analysis in Forensic Science, 2nd Edition

"This book should have a place on the bookshelf of every forensic scientist who cares about the science of evidence interpretation" Dr. Ian Evett, Principal Forensic Services Ltd, London, UK

Continuing developments in science and technology mean that the amount of information forensic scientists are able to provide for criminal investigations is ever increasing. The commensurate increase in complexity creates difficulties for scientists and lawyers with regard to evaluation and interpretation, notably with respect to issues of inference and decision. Probability theory, implemented through graphical methods, and specifically Bayesian networks, provides powerful methods to deal with this complexity. Extensions of these methods to elements of decision theory provide further support and assistance to the judicial system. Bayesian Networks for Probabilistic Inference and Decision Analysis in Forensic Science provides a unique and comprehensive introduction to the use of Bayesian decision networks for the evaluation and interpretation of scientific findings in forensic science, and for the support of decision-makers in their scientific and legal tasks.

- Includes self-contained introductions to probability and decision theory.
- Develops the characteristics of Bayesian networks, object-oriented Bayesian networks and their extension to decision models.
- Features implementation of the methodology with reference to commercial and academically available software.
- Presents standard networks and their extensions that can be easily implemented and that can assist in the reader's own analysis of real cases.
- Provides a technique for structuring problems and organizing data based on methods and principles of scientific reasoning.
- Contains a method for the construction of coherent and defensible arguments for the analysis and evaluation of scientific findings and for decisions based on them.
- Is written in a lucid style, suitable for forensic scientists and lawyers with minimal mathematical background.
- Includes a foreword by Ian Evett.

The clear and accessible style of this second edition makes this book ideal for all forensic scientists, applied statisticians and graduate students wishing to evaluate forensic findings from the perspective of probability and decision analysis. It will also appeal to lawyers and other scientists and professionals interested in the evaluation and interpretation of forensic findings, including decision making based on scientific information.

Designing and Conducting Survey Research: A Comprehensive Guide, 4th Edition

The industry standard guide, updated with new ideas and SPSS analysis techniques

Designing and Conducting Survey Research: A Comprehensive Guide, Fourth Edition is the industry standard resource that covers all major components of the survey process, updated to include new data analysis techniques and SPSS procedures with sample data sets online. The book offers practical, actionable guidance on constructing the instrument, administering the process, and analyzing and reporting the results, providing extensive examples and worksheets that demonstrate the appropriate use of survey and data techniques. By clarifying complex statistical concepts and modern analysis methods, this guide enables readers to conduct a survey research project from initial focus concept to the final report. Public and nonprofit managers with survey research responsibilities need to stay up-to-date on the latest methods, techniques, and best practices for optimal data collection, analysis, and reporting. Designing and Conducting Survey Research is a complete resource, answering the "what", "why", and "how" every step of the way, and providing the latest information about technological advancements in data analysis. The updated fourth edition contains step-by-step SPSS data entry and analysis procedures, as well as SPSS examples throughout the text, using real data sets from real-world studies. Other new information includes topics like:

- Nonresponse error/bias
- Ethical concerns and special populations
- Cell phone samples in telephone surveys
- Subsample screening and complex skip patterns

The fourth edition also contains new information on the growing importance of focus groups, and places a special emphasis on data quality including size and variability. Those who employ survey research methods will find that Designing and Conducting Survey Research contains all the information needed to better design, conduct, and analyze a more effective survey.

Professional Microsoft SQL Server 2014 Administration

Learn to take advantage of the opportunities offered by SQL Server 2014

Microsoft's SQL Server 2014 update means big changes for database administrators, and you need to get up to speed quickly because your methods, workflow, and favorite techniques will be different from here on out. The update's enhanced support of large-scale enterprise databases and significant price advantage mean that SQL Server 2014 will become even more widely adopted across the industry. The update includes new backup and recovery tools, new AlwaysOn features, and enhanced cloud capabilities. In-memory OLTP, Buffer Pool Extensions for SSDs, and a new Cardinality Estimator can improve functionality and smooth out the workflow, but only if you understand their full capabilities. Professional Microsoft SQL Server 2014 is your comprehensive guide to working with the new environment. Authors Adam Jorgensen, Bradley Ball, Ross LoForte, Steven Wort, and Brian Knight are the dream team of the SQL Server community, and they put their expertise to work guiding you through the changes.

- Improve oversight with better management and monitoring
- Protect your work with enhanced security features
- Upgrade performance tuning, scaling, replication, and clustering
- Learn new options for backup and recovery

Professional Microsoft SQL Server 2014 includes a companion website with sample code and efficient automation utilities, plus a host of tips, tricks, and workarounds that will make your job as a DBA or database architect much easier. Stop getting frustrated with administrative issues and start taking control. Professional Microsoft SQL Server 2014 is your roadmap to mastering the update and creating solutions that work.

Sams Teach Yourself NoSQL with MongoDB in 24 Hours

NoSQL database usage is growing at a stunning 50% per year, as organizations discover NoSQL's potential to address even the most challenging Big Data and real-time database problems. Every NoSQL database is different, but one is the most popular by far: MongoDB. Now, in just 24 lessons of one hour or less, you can learn how to leverage MongoDB's immense power. Each short, easy lesson builds on all that's come before, teaching NoSQL concepts and MongoDB techniques from the ground up. Sams Teach Yourself NoSQL with MongoDB in 24 Hours covers all this, and much more:

- Learning how NoSQL is different, when to use it, and when to use traditional RDBMSes instead
- Designing and implementing MongoDB databases of diverse types and sizes
- Storing and interacting with data via Java, PHP, Python, and Node.js/Mongoose
- Choosing the right NoSQL distribution model for your application
- Installing and configuring MongoDB
- Designing MongoDB data models, including collections, indexes, and GridFS
- Balancing consistency, performance, and durability
- Leveraging the immense power of Map-Reduce
- Administering, monitoring, securing, backing up, and repairing MongoDB databases
- Mastering advanced techniques such as sharding and replication
- Optimizing performance

Implementing IBM Software Defined Network for Virtual Environments

This IBM® Redbooks® publication shows how to integrate IBM Software Defined Network for Virtual Environments (IBM SDN VE) seamlessly within a new or existing data center. This book is aimed at pre- and post-sales support, targeting network administrators and other technical professionals who want to get an overview of this new and exciting technology, and see how it fits into the overall vision of a truly Software Defined Environment. It shows you all of the steps that are required to design, install, maintain, and troubleshoot the IBM SDN VE product. It also highlights specific, real-world examples that showcase the power and flexibility that IBM SDN VE has over traditional solutions with a legacy network infrastructure that is applied to virtual systems. This book assumes that you have a general familiarity with networking and virtualization. It does not assume an in-depth understanding of KVM or VMware. It is written for administrators who want to get a quick start with IBM SDN VE in their respective virtualized infrastructure, and to get some virtual machines up and running by using the rich features of the product in a short amount of time (days, not weeks or months).

Pro Apache Hadoop, Second Edition

Pro Apache Hadoop, Second Edition brings you up to speed on Hadoop, the framework of big data. Revised to cover Hadoop 2.0, the book covers the very latest developments such as YARN (aka MapReduce 2.0), new HDFS high-availability features, and increased scalability in the form of HDFS Federations. All the old content has been revised too, giving the latest on the ins and outs of MapReduce, cluster design, the Hadoop Distributed File System, and more. This book covers everything you need to build your first Hadoop cluster and begin analyzing and deriving value from your business and scientific data. Learn to solve big-data problems the MapReduce way, by breaking a big problem into chunks and creating small-scale solutions that can be flung across thousands upon thousands of nodes to analyze large data volumes in a short amount of wall-clock time. Learn how to let Hadoop take care of distributing and parallelizing your software: you just focus on the code, and Hadoop takes care of the rest.

- Covers all that is new in Hadoop 2.0
- Written by a professional involved in Hadoop since day one
- Takes you quickly to the seasoned pro level on the hottest cloud-computing framework
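The "MapReduce way" described above can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle, and reduce phases on a word count, not Hadoop's API; the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical, and a real cluster would run the map and reduce calls on different nodes.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(chunks):
    """Run the three phases over a list of independent input chunks."""
    mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
```

Because each chunk is mapped independently and each key is reduced independently, both phases can in principle be spread across many workers, which is the property the blurb refers to.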

Actionable Intelligence: A Guide to Delivering Business Results with Big Data Fast!

Building an analysis ecosystem for a smarter approach to intelligence

Keith Carter's Actionable Intelligence: A Guide to Delivering Business Results with Big Data Fast! is the comprehensive guide to achieving the dream that business intelligence practitioners have been chasing since the concept itself came into being. Written by an IT visionary with extensive global supply chain experience and insight, this book describes what happens when team members have accurate, reliable, usable, and timely information at their fingertips. With a focus on leveraging big data, the book provides expert guidance on developing an analytical ecosystem to effectively manage and use internal and external information to deliver business results. This book is written by an author who's been in the trenches for people who are in the trenches. It's for practitioners in the real world, who know delivering results is easier said than done, fraught with failure and difficult politics: a landscape where reason and passion are needed to make a real difference. This book lays out the appropriate way to establish a culture of fact-based decision making, innovation, forward-looking measurements, and appropriate high-speed governance. Readers will enable their organization to:

- Answer strategic questions faster
- Reduce data acquisition time and increase analysis time to improve outcomes
- Shift the focus to positive results rather than past failures
- Expand opportunities by more effectively and thoughtfully leveraging information

Big data makes big promises, but it cannot deliver without the right recipe of people, processes, and technology in place. It's about choosing the right people, giving them the right tools, and taking a thoughtful, rather than formulaic, approach. Actionable Intelligence provides expert guidance toward envisioning, budgeting, implementing, and delivering real benefits.

Cognitive Interviewing Methodology

AN INTERDISCIPLINARY PERSPECTIVE TO THE EVOLUTION OF THEORY AND METHODOLOGY WITHIN COGNITIVE INTERVIEW PROCESSES

Providing a comprehensive approach to cognitive interviewing in the field of survey methodology, Cognitive Interviewing Methodology delivers a clear guide that draws upon modern, cutting-edge research from a variety of fields. Each chapter begins by summarizing the prevailing paradigms that currently dominate the field of cognitive interviewing. Then underlying theoretical foundations are presented, which supply readers with the necessary background to understand newly evolving techniques in the field. The theories lead into methods developed and practiced by leading practitioners, researchers, and academics. Finally, the edited guide lays out the limitations of cognitive interviewing studies and explores the benefits of cognitive interviewing with other methodological approaches. With a primary focus on question evaluation, Cognitive Interviewing Methodology also includes:

- Step-by-step procedures for conducting cognitive interviewing studies, which include the various aspects of data collection, questionnaire design, and data interpretation
- Newly developed tools to benefit cognitive interviewing studies as well as the field of question evaluation, such as Q-Notes, a data entry and analysis software application, and Q-Bank, an online resource that houses question evaluation studies
- A unique method for questionnaire designers, survey managers, and data users to analyze, present, and document survey data results from a cognitive interviewing study

An excellent reference for survey researchers and practitioners in the social sciences who utilize cognitive interviewing techniques in their everyday work, Cognitive Interviewing Methodology is also a useful supplement for courses on survey methods at the upper-undergraduate and graduate levels.

Robust Cluster Analysis and Variable Selection

Clustering remains a vibrant area of research in statistics. Although there are many books on this topic, there are relatively few that are well founded in the theoretical aspects. In Robust Cluster Analysis and Variable Selection, Gunter Ritter presents an overview of the theory and applications of probabilistic clustering and variable selection, synthesizing the key research results of the last 50 years. The author focuses on the robust clustering methods he found to be the most useful on simulated data and real-time applications. The book provides clear guidance for the varying needs of both applications, describing scenarios in which accuracy and speed are the primary goals. Robust Cluster Analysis and Variable Selection includes all of the important theoretical details, and covers the key probabilistic models, robustness issues, optimization algorithms, validation techniques, and variable selection methods. The book illustrates the different methods with simulated data and applies them to real-world data sets that can be easily downloaded from the web. This provides you with guidance in how to use clustering methods as well as applicable procedures and algorithms without having to understand their probabilistic fundamentals.

Statistical Process Control for Managers

If you have been frustrated by very technical statistical process control (SPC) training materials, then this is the book for you. This book focuses on how SPC works and why managers should consider using it in their operations. It provides you with a conceptual understanding of SPC so that appropriate decisions can be made about the benefits of incorporating SPC into the process management and quality improvement processes. Today, there is little need to make the necessary calculations by hand, so the author utilizes Minitab and NWA Quality Analyst—two of the most popular statistical analysis software packages on the market. Links are provided to the home pages of these software packages where trial versions may be downloaded for evaluation and trial use. The book also addresses the question of why SPC should be considered for use, the process of implementing SPC, how to incorporate SPC into problem identification, problem solving, and the management and improvement of processes, products, and services.
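As a rough idea of the calculations that such software automates, the sketch below computes control limits for an individuals (X) chart from the average moving range, using the standard constant 2.66 (3 divided by d2 = 1.128 for ranges of two points). It is an illustration in plain Python, not output or code from Minitab or NWA Quality Analyst, and the function names are hypothetical.

```python
from statistics import mean


def individuals_chart_limits(values):
    """Return (lower limit, center line, upper limit) for an individuals chart."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    center = mean(values)
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    spread = 2.66 * mean(moving_ranges)
    return center - spread, center, center + spread


def out_of_control(values):
    """Return the indexes of observations that fall outside the limits."""
    lower, _, upper = individuals_chart_limits(values)
    return [i for i, v in enumerate(values) if v < lower or v > upper]
```

For the series 10, 12, 10, 12, 10, 30, the center line is 14 and the average moving range is 5.6, so the upper limit is 14 + 2.66 × 5.6 = 28.896 and only the last point is flagged.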

Oracle PL/SQL Performance Tuning Tips & Techniques

Proven PL/SQL Optimization Solutions

In Oracle PL/SQL Performance Tuning Tips & Techniques, Oracle ACE authors with decades of experience building complex production systems for government, industry, and educational organizations present a hands-on approach to enabling optimal results from PL/SQL. The book begins by describing the discovery process required to pinpoint performance problems and then provides measurable and repeatable test cases. In-depth coverage of linking SQL and PL/SQL is followed by deep dives into essential Oracle Database performance tuning tools. Real-world examples and best practices are included throughout this Oracle Press guide.

- Follow a request-driven nine-step process to identify and address performance problems in web applications
- Use performance-related database tools, including data dictionary views, logging, tracing, PL/SQL Hierarchical Profiler, PL/Scope, and RUNSTATS
- Instrument code to pinpoint performance issues using call stack APIs, error stack APIs, and timing markers
- Embed PL/SQL in SQL and manage user-defined functions
- Embed SQL in PL/SQL using a set-based approach to handle large volumes of data
- Properly write and deploy data manipulation language triggers to avoid performance problems
- Work with advanced datatypes, including LOBs and XML
- Use caching techniques to avoid redundant operations
- Effectively use dynamic SQL to reduce the amount of code needed and streamline system management
- Manage version control and ensure that performance fixes are successfully deployed

Code examples in the book are available for download.

Modern Enterprise Business Intelligence and Data Management

Nearly every large corporation and governmental agency is taking a fresh look at their current enterprise-scale business intelligence (BI) and data warehousing implementations at the dawn of the "Big Data Era"…and most see a critical need to revitalize their current capabilities. Whether they find the frustrating and business-impeding continuation of a long-standing "silos of data" problem, or an over-reliance on static production reports at the expense of predictive analytics and other true business intelligence capabilities, or a lack of progress in achieving the long-sought-after enterprise-wide "single version of the truth" – or all of the above – IT Directors, strategists, and architects find that they need to go back to the drawing board and produce a brand new BI/data warehousing roadmap to help move their enterprises from their current state to one where the promises of emerging technologies and a generation’s worth of best practices can finally deliver high-impact, architecturally evolvable enterprise-scale business intelligence and data warehousing. Author Alan Simon, whose BI and data warehousing experience dates back to the late 1970s and who has personally delivered or led more than thirty enterprise-wide BI/data warehousing roadmap engagements since the mid-1990s, details a comprehensive step-by-step approach to building a best practices-driven, multi-year roadmap in the quest for architecturally evolvable BI and data warehousing at the enterprise scale. Simon addresses the triad of technology, work processes, and organizational/human factors considerations in a manner that blends the visionary and the pragmatic. 

- Takes a fresh look at true enterprise-scale BI/DW in the "Dawn of the Big Data Era"
- Details a checklist-based approach to surveying one’s current state and identifying which components are enterprise-ready and which ones are impeding the key objectives of enterprise-scale BI/DW
- Provides an approach for how to analyze and test-bed emerging technologies and architectures and then figure out how to include the relevant ones in the roadmaps that will be developed
- Presents a tried-and-true methodology for building a phased, incremental, and iterative enterprise BI/DW roadmap that is closely aligned with an organization’s business imperatives, organizational culture, and other considerations

Pro Couchbase Server

The NoSQL movement has fundamentally changed the database world in recent years. Influenced by the growing needs of web-scale applications, NoSQL databases such as Couchbase Server provide new approaches to scalability, reliability, and performance. With the power and flexibility of Couchbase Server, you can model your data however you want, and easily change the data model any time you want. Pro Couchbase Server is a hands-on guide for developers and administrators who want to take advantage of the power and scalability of Couchbase Server in their applications. This book takes you from the basics of NoSQL database design, through application development, to Couchbase Server administration. Never have document databases been so powerful and performant. Pro Couchbase Server shows what is possible and helps you take full advantage of Couchbase Server and all the performance and scalability that it offers. Helps you design and develop a document database using Couchbase Server. Takes you through deploying and maintaining Couchbase Server. Gives you the tools to scale out your application as needed.

Visual Storytelling with D3: An Introduction to Data Visualization in JavaScript™

Master D3, Today’s Most Powerful Tool for Visualizing Data on the Web

Data-driven graphics are everywhere these days, from websites and mobile apps to interactive journalism and high-end presentations. Using D3, you can create graphics that are visually stunning and powerfully effective. Visual Storytelling with D3 is a hands-on, full-color tutorial that teaches you to design charts and data visualizations to tell your story quickly and intuitively, and that shows you how to wield the powerful D3 JavaScript library. Drawing on his extensive experience as a professional graphic artist, writer, and programmer, Ritchie S. King walks you through a complete sample project, from conception through data selection and design. Step by step, you’ll build your skills, mastering increasingly sophisticated graphical forms and techniques. If you know a little HTML and CSS, you have all the technical background you’ll need to master D3. This tutorial is for web designers creating graphics-driven sites, services, tools, or dashboards; online journalists who want to visualize their content; researchers seeking to communicate their results more intuitively; marketers aiming to deepen their connections with customers; and any data visualization enthusiast.

Coverage includes:

- Identifying a data-driven story and telling it visually
- Creating and manipulating beautiful graphical elements with SVG
- Shaping web pages with D3
- Structuring data so D3 can easily visualize it
- Using D3’s data joins to connect your data to the graphical elements on a web page
- Sizing and scaling charts, and adding axes to them
- Loading and filtering data from external standalone datasets
- Animating your charts with D3’s transitions
- Adding interactivity to visualizations, including a play button that cycles through different views of your data
- Finding D3 resources and getting involved in the thriving online D3 community

About the Website: All of this book’s examples are available at ritchiesking.com/book, along with video tutorials, updates, supporting material, and even more examples, as they become available.