talk-data.com

Topic: data (5765 tagged)

Activity Trend, 2020-Q1 to 2026-Q1 (peak 3/qtr)

Activities

5765 activities · Newest first

Web and Network Data Science: Modeling Techniques in Predictive Analytics

Master modern web and network data modeling: both theory and applications. In Web and Network Data Science, a top faculty member of Northwestern University’s prestigious analytics program presents the first fully integrated treatment of both the business and academic elements of web and network modeling for predictive analytics. Some books in this field focus either entirely on business issues (e.g., Google Analytics and SEO); others are strictly academic (covering topics such as sociology, complexity theory, ecology, applied physics, and economics). This text gives today's managers and students what they really need: integrated coverage of concepts, principles, and theory in the context of real-world applications. Building on his pioneering Web Analytics course at Northwestern University, Thomas W. Miller covers usability testing, Web site performance, usage analysis, social media platforms, search engine optimization (SEO), and many other topics. He balances this practical coverage with accessible and up-to-date introductions to both social network analysis and network science, demonstrating how these disciplines can be used to solve real business problems.

Big Data and Health Analytics

Data availability is surpassing existing paradigms for governing, managing, analyzing, and interpreting health data. Big Data and Health Analytics provides frameworks, use cases, and examples that illustrate the role of big data and analytics in modern health care, including how public health information can inform health delivery. Written for health care professionals and executives, this is not a technical book on the use of statistics and machine-learning algorithms for extracting knowledge out of data, nor a book on the intricacies of database design. Instead, this book presents the current thinking of academic and industry researchers and leaders from around the world. Using non-technical language, this book is accessible to health care professionals who might not have an IT and analytics background. It includes case studies that illustrate the business processes underlying the use of big data and health analytics to improve health care delivery. Highlighting lessons learned from the case studies, the book supplies readers with the foundation required for further specialized study in health analytics and data management. Coverage includes community health information, information visualization which offers interactive environments and analytic processes that support exploration of EHR data, the governance structure required to enable data analytics and use, federal regulations and the constraints they place on analytics, and information security. Links to websites, videos, articles, and other online content that expand and support the primary learning objectives for each major section of the book are also included to help you develop the skills you will need to achieve quality improvements in health care delivery through the effective use of data and analytics.

Principles of System Identification

Master Techniques and Successfully Build Models Using a Single Resource. Vital to all data-driven or measurement-based process operations, system identification is grounded in observational science and centers on developing mathematical models from observed data. Principles of System Identification: Theory and Practice is an introductory-level book that presents the basic foundations and underlying methods relevant to system identification. The overall scope of the book focuses on system identification with an emphasis on practice, and concentrates most specifically on discrete-time linear system identification. Useful for Both Theory and Practice. The book presents the foundational pillars of identification, namely, the theory of discrete-time LTI systems, the basics of signal processing, the theory of random processes, and estimation theory. It explains the core theoretical concepts of building (linear) dynamic models from experimental data, as well as the experimental and practical aspects of identification. The author offers glimpses of modern developments in this area, and provides numerical and simulation-based examples, case studies, end-of-chapter problems, and ample references to code for illustration and training. Comprising 26 chapters, and ideal for coursework and self-study, this extensive text: provides the essential concepts of identification; lays down the foundations of mathematical descriptions of systems, random processes, and estimation in the context of identification; discusses in detail the theory pertaining to non-parametric and parametric models for deterministic-plus-stochastic LTI systems; demonstrates the concepts and methods of identification on different case studies; presents a gradual development of state-space identification and grey-box modeling; offers an overview of advanced topics of identification, namely linear time-varying (LTV), non-linear, and closed-loop identification; discusses a multivariable approach to identification using iterative principal component analysis; and embeds MATLAB® code for the illustrated examples at the respective points in the text. Principles of System Identification: Theory and Practice presents a formal base in LTI deterministic and stochastic systems modeling and estimation theory; it is a one-stop reference for introductory to moderately advanced courses on system identification, as well as introductory courses on stochastic signal processing or time-series analysis. The MATLAB scripts and SIMULINK models used as examples and case studies in the book are also available on the author's website: http://arunkt.wix.com/homepage#!textbook/c397
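To give a feel for the discrete-time identification workflow the book formalizes (the book's own code is MATLAB), here is a minimal Python/NumPy sketch that estimates a first-order ARX model from simulated input-output data by ordinary least squares; the model structure, noise level, and variable names are illustrative assumptions, not material from the book.

```python
import numpy as np

# Simulate a first-order ARX process: y[k] = a*y[k-1] + b*u[k-1] + e[k]
rng = np.random.default_rng(0)
a_true, b_true = 0.7, 1.5           # "true" parameters (illustrative)
N = 500
u = rng.standard_normal(N)          # persistently exciting input
e = 0.1 * rng.standard_normal(N)    # white measurement noise
y = np.zeros(N)
for k in range(1, N):
    y[k] = a_true * y[k - 1] + b_true * u[k - 1] + e[k]

# Build the regressor matrix and solve the least-squares problem
Phi = np.column_stack([y[:-1], u[:-1]])    # rows: [y[k-1], u[k-1]]
theta, *_ = np.linalg.lstsq(Phi, y[1:], rcond=None)
a_hat, b_hat = theta
print(f"a_hat = {a_hat:.3f}, b_hat = {b_hat:.3f}")  # should be close to 0.7 and 1.5
```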

Beginning Apache Cassandra Development

Beginning Apache Cassandra Development introduces you to one of the most robust and best-performing NoSQL database platforms on the planet. Apache Cassandra is a distributed, wide-column NoSQL database, specifically designed to manage large amounts of data across many commodity servers without any single point of failure. This design approach makes Apache Cassandra a robust and easy-to-implement platform when high availability is needed. Apache Cassandra can be used by developers in Java, PHP, Python, and JavaScript—the primary and most commonly used languages. In Beginning Apache Cassandra Development, author and Cassandra expert Vivek Mishra takes you through using Apache Cassandra from each of these primary languages. Mishra also covers the Cassandra Query Language (CQL), the Apache Cassandra analog to SQL. You'll learn to develop applications sourcing data from Cassandra, query that data, and deliver it at speed to your application's users. Cassandra is one of the leading NoSQL databases, meaning you get unparalleled throughput and performance without the sort of processing overhead that comes with traditional proprietary databases. Beginning Apache Cassandra Development will therefore help you create applications that generate search results quickly, stand up to high levels of demand, scale as your user base grows, ensure operational simplicity, and—not least—provide delightful user experiences.
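As a concrete taste of CQL from one of the languages the book covers, the following minimal sketch uses the DataStax cassandra-driver package for Python against a locally running node; the keyspace, table, and contact point are illustrative assumptions rather than examples from the book.

```python
# pip install cassandra-driver   (DataStax Python driver for Apache Cassandra)
from uuid import uuid4
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])        # assumes a local single-node cluster
session = cluster.connect()

# Create an illustrative keyspace and table (names are hypothetical)
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS demo
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
""")
session.execute("""
    CREATE TABLE IF NOT EXISTS demo.users (
        user_id uuid PRIMARY KEY,
        name    text,
        email   text
    )
""")

# Insert and read back a row using CQL
uid = uuid4()
session.execute(
    "INSERT INTO demo.users (user_id, name, email) VALUES (%s, %s, %s)",
    (uid, "Ada", "ada@example.com"),
)
row = session.execute(
    "SELECT name, email FROM demo.users WHERE user_id = %s", (uid,)
).one()
print(row.name, row.email)

cluster.shutdown()
```

The same CQL statements apply regardless of which driver language you use; only the client-side API for connecting and binding parameters changes.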

Introduction to High-Dimensional Statistics

Ever-greater computing technologies have given rise to an exponentially growing volume of data. Today massive data sets (with potentially thousands of variables) play an important role in almost every branch of modern human activity, including networks, finance, and genetics. However, analyzing such data has presented a challenge for statisticians and data analysts and has required the development of new statistical methods capable of separating the signal from the noise. Introduction to High-Dimensional Statistics is a concise guide to state-of-the-art models, techniques, and approaches for handling high-dimensional data. The book is intended to expose the reader to the key concepts and ideas in the most simple settings possible while avoiding unnecessary technicalities. Offering a succinct presentation of the mathematical foundations of high-dimensional statistics, this highly accessible text: Describes the challenges related to the analysis of high-dimensional data Covers cutting-edge statistical methods including model selection, sparsity and the lasso, aggregation, and learning theory Provides detailed exercises at the end of every chapter with collaborative solutions on a wikisite Illustrates concepts with simple but clear practical examples Introduction to High-Dimensional Statistics is suitable for graduate students and researchers interested in discovering modern statistics for massive data. It can be used as a graduate text or for self-study.
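The book is deliberately theory-focused, but the lasso it analyzes is easy to experiment with; the short scikit-learn sketch below (an illustration, not material from the book) recovers a sparse coefficient vector in a setting with more variables than observations, and the dimensions and noise level are arbitrary choices.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p = 100, 500                          # fewer observations than variables
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = [3.0, -2.0, 1.5, -1.0, 2.5]   # only 5 truly active coefficients
y = X @ beta + 0.5 * rng.standard_normal(n)

model = Lasso(alpha=0.1).fit(X, y)       # the l1 penalty induces sparsity
support = np.flatnonzero(model.coef_)
print("estimated nonzero coefficients:", support)  # ideally close to indices 0..4
```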

Statistical Computing in Nuclear Imaging

Statistical Computing in Nuclear Imaging introduces aspects of Bayesian computing in nuclear imaging. The book provides an introduction to Bayesian statistics and concepts and is highly focused on the computational aspects of Bayesian data analysis of photon-limited data acquired in tomographic measurements. Basic statistical concepts, elements of decision theory, and counting statistics, including models of photon-limited data and Poisson approximations, are discussed in the first chapters. Monte Carlo methods and Markov chains in posterior analysis are discussed next along with an introduction to nuclear imaging and applications such as PET and SPECT. The final chapter includes illustrative examples of statistical computing, based on Poisson-multinomial statistics. Examples include calculation of Bayes factors and risks as well as Bayesian decision making and hypothesis testing. Appendices cover probability distributions, elements of set theory, multinomial distribution of single-voxel imaging, and derivations of sampling distribution ratios. C++ code used in the final chapter is also provided. The text can be used as a textbook that provides an introduction to Bayesian statistics and advanced computing in medical imaging for physicists, mathematicians, engineers, and computer scientists. It is also a valuable resource for a wide spectrum of practitioners of nuclear imaging data analysis, including seasoned scientists and researchers who have not been exposed to Bayesian paradigms.
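To convey the flavor of Bayesian analysis of photon-limited (Poisson) count data, which the book develops in much greater depth (with C++ code), here is a simplified Python sketch using the conjugate Gamma prior for a single detector bin's count rate; the prior hyperparameters and counts are illustrative assumptions.

```python
import numpy as np

# Observed photon counts from repeated measurements of one detector bin (illustrative)
counts = np.array([7, 9, 6, 8, 10])

# Gamma(a0, b0) prior on the Poisson rate, shape/rate parameterization (illustrative choice)
a0, b0 = 1.0, 0.1

# Conjugacy: the posterior is Gamma(a0 + sum(counts), b0 + n)
a_post = a0 + counts.sum()
b_post = b0 + counts.size

# Monte Carlo summary of the posterior (numpy's gamma takes shape and scale = 1/rate)
rng = np.random.default_rng(0)
samples = rng.gamma(a_post, 1.0 / b_post, size=100_000)
print("posterior mean rate:", round(samples.mean(), 2))
print("95% credible interval:", np.round(np.percentile(samples, [2.5, 97.5]), 2))
```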

MATLAB Symbolic Algebra and Calculus Tools

MATLAB is a high-level language and environment for numerical computation, visualization, and programming. Using MATLAB, you can analyze data, develop algorithms, and create models and applications. The language, tools, and built-in math functions enable you to explore multiple approaches and reach a solution faster than with spreadsheets or traditional programming languages, such as C/C++ or Java. MATLAB Symbolic Algebra and Calculus Tools introduces you to the MATLAB language with practical hands-on instructions and results, allowing you to quickly achieve your goals. Starting with a look at symbolic variables and functions, you will learn how to solve equations in MATLAB, both symbolically and numerically, and how to simplify the results. Polynomial solutions, inequalities, and systems of equations are covered in detail. You will see how MATLAB incorporates vector, matrix, and character variables, and functions thereof. MATLAB is a powerful symbolic manipulator which enables you to factorize, expand, and simplify complex algebraic expressions over all common fields (including finite fields and algebraic field extensions of the rational numbers). With MATLAB you can also work with ease in matrix algebra, making use of commands which allow you to find eigenvalues, eigenvectors, determinants, norms, and various matrix decompositions, among many other features. Lastly, you will see how you can use MATLAB to explore mathematical analysis, finding limits of sequences and functions, sums of series, integrals, and derivatives, and solving differential equations.
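The book works in MATLAB's Symbolic Math Toolbox; as a rough open-source analogue (not the book's code), the Python/SymPy sketch below carries out the same kinds of tasks: solving an equation symbolically, simplifying an expression, and computing a limit, a series sum, an integral, and a derivative.

```python
import sympy as sp

x, n = sp.symbols("x n")

# Solve a quadratic symbolically and simplify a rational expression
roots = sp.solve(x**2 - 5*x + 6, x)              # [2, 3]
simplified = sp.simplify((x**2 - 1) / (x - 1))   # x + 1

# Limits, series sums, integrals, and derivatives
lim = sp.limit(sp.sin(x) / x, x, 0)                           # 1
series_sum = sp.summation(1 / n**2, (n, 1, sp.oo))            # pi**2/6
integral = sp.integrate(sp.exp(-x**2), (x, -sp.oo, sp.oo))    # sqrt(pi)
deriv = sp.diff(sp.sin(x) * sp.exp(x), x)                     # exp(x)*(sin(x) + cos(x))

print(roots, simplified, lim, series_sum, integral, deriv)
```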

Practical Hadoop Security

Practical Hadoop Security is an excellent resource for administrators planning a production Hadoop deployment who want to secure their Hadoop clusters. In this detailed guide to the security options and configuration within Hadoop itself, author Bhushan Lakhe takes you through a comprehensive study of how to implement defined security within a Hadoop cluster in a hands-on way. You will start with a detailed overview of all the security options available for Hadoop, including popular extensions like Kerberos and OpenSSH, and then delve into a hands-on implementation of user security (with illustrated code samples), using both in-the-box features and security extensions implemented by leading vendors. No security system is complete without a monitoring and tracing facility, so Practical Hadoop Security next steps you through audit logging and monitoring technologies for Hadoop, as well as ready-to-use implementation and configuration examples--again with illustrated code samples. The book concludes with the most important aspect of Hadoop security: encryption. Both types of encryption, for data in transit and data at rest, are discussed at length, along with leading open source projects that integrate directly with Hadoop at no licensing cost. Practical Hadoop Security: explains the importance of security, auditing, and encryption within a Hadoop installation; describes how the leading players have incorporated these features within their Hadoop distributions and provided extensions; and demonstrates how to set up and use these features to your benefit and make your Hadoop installation secure without impacting performance or ease of use.

Data Scientists at Work

Data Scientists at Work is a collection of interviews with sixteen of the world's most influential and innovative data scientists from across the spectrum of this hot new profession. "Data scientist is the sexiest job in the 21st century," according to the Harvard Business Review. By 2018, the United States will experience a shortage of 190,000 skilled data scientists, according to a McKinsey report. Through incisive in-depth interviews, this book mines the what, how, and why of the practice of data science from the stories, ideas, shop talk, and forecasts of its preeminent practitioners across diverse industries: social network (Yann LeCun, Facebook); professional network (Daniel Tunkelang, LinkedIn); venture capital (Roger Ehrenberg, IA Ventures); enterprise cloud computing and neuroscience (Eric Jonas, formerly Salesforce.com); newspaper and media (Chris Wiggins, The New York Times); streaming television (Caitlin Smallwood, Netflix); music forecast (Victor Hu, Next Big Sound); strategic intelligence (Amy Heineike, Quid); environmental big data (André Karpištšenko, Planet OS); geospatial marketing intelligence (Jonathan Lenaghan, PlaceIQ); advertising (Claudia Perlich, Dstillery); fashion e-commerce (Anna Smith, Rent the Runway); specialty retail (Erin Shellman, Nordstrom); email marketing (John Foreman, MailChimp); predictive sales intelligence (Kira Radinsky, SalesPredict); and humanitarian nonprofit (Jake Porway, DataKind). Each of these data scientists shares how he or she tailors the torrent-taming techniques of big data, data visualization, search, and statistics to specific jobs by dint of ingenuity, imagination, patience, and passion. The book features a stimulating foreword by Google's Director of Research, Peter Norvig. Data Scientists at Work parts the curtain on the interviewees’ earliest data projects, how they became data scientists, their discoveries and surprises in working with data, their thoughts on the past, present, and future of the profession, their experiences of team collaboration within their organizations, and the insights they have gained as they get their hands dirty refining mountains of raw data into objects of commercial, scientific, and educational value for their organizations and clients.

eXist

Get a head start with eXist, the open source NoSQL database and application development platform built entirely around XML technologies. With this hands-on guide, you’ll learn eXist from the ground up, from using this feature-rich database to work with millions of documents to building complex web applications that take advantage of eXist’s many extensions. If you’re familiar with XML—as a student, professor, publisher, or developer—you’ll find that eXist is ideal for all kinds of documents.

Big Data Now: 2014 Edition

In the four years that O'Reilly Media, Inc. has produced its annual Big Data Now report, the data field has grown from infancy into young adulthood. Data is now a leader in some fields and a driver of innovation in others, and companies that use data and analytics to drive decision-making are outperforming their peers. And while access to big data tools and techniques once required significant expertise, today many tools have improved and communities have formed to share best practices. Companies have also started to emphasize the importance of processes, culture, and people. The topics in Big Data Now: 2014 Edition represent the major forces currently shaping the data world:
• Cognitive augmentation: predictive APIs, graph analytics, and network science dashboards
• Intelligence matters: defining AI, modeling intelligence, deep learning, and "summoning the demon"
• Cheap sensors, fast networks, and distributed computing: stream processing, hardware data flows, and computing at the edge
• Data (science) pipelines: broadening the coverage of analytic pipelines with specialized tools
• Evolving marketplace of big data components: SSDs, Hadoop 2, Spark; and why datacenters need operating systems
• Design and social science: human-centered design, wearables and real-time communications, and wearable etiquette
• Building a data culture: moving from prediction to real-time adaptation; and why you need to become a data skeptic
• Perils of big data: data redlining, intrusive data analysis, and the state of big data ethics

Accelerating MATLAB Performance

The MATLAB® programming environment is often perceived as a platform suitable for prototyping and modeling but not for "serious" applications. One of the main complaints is that MATLAB is just too slow. Accelerating MATLAB Performance aims to correct this perception by describing multiple ways to greatly improve MATLAB program speed. Packed with thousands of helpful tips, it leaves no stone unturned, discussing every aspect of MATLAB. Ideal for novices and professionals alike, the book describes MATLAB performance at a scale and depth never before published. It takes a comprehensive approach to MATLAB performance, illustrating numerous ways to attain the desired speedup. The book covers MATLAB, CPU, and memory profiling and discusses various tradeoffs in performance tuning. It describes both the application of standard industry techniques in MATLAB and methods that are specific to MATLAB, such as using different data types or built-in functions. The book covers MATLAB vectorization, parallelization (implicit and explicit), optimization, memory management, chunking, and caching. It explains MATLAB’s memory model and details how it can be leveraged. It describes the use of GPU, MEX, FPGA, and other forms of compiled code, as well as techniques for speeding up deployed applications. It details specific tips for MATLAB GUI, graphics, and I/O. It also reviews a wide variety of utilities, libraries, and toolboxes that can help to improve performance. Sufficient information is provided to allow readers to immediately apply the suggestions to their own MATLAB programs. Extensive references are also included to allow those who wish to expand the treatment of a particular topic to do so easily. Supported by an active website and numerous code examples, the book will help readers rapidly attain significant reductions in development costs and program run times.
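The book's advice is MATLAB-specific, but the core idea behind vectorization carries over to any array language; the short Python/NumPy sketch below (an analogue for illustration, not code from the book) contrasts an explicit element-by-element loop with a single vectorized array operation.

```python
import numpy as np
from timeit import timeit

x = np.random.rand(1_000_000)

def loop_sum_of_squares(v):
    total = 0.0
    for value in v:              # element-by-element loop, interpreted per iteration
        total += value * value
    return total

def vectorized_sum_of_squares(v):
    return float(np.dot(v, v))   # one optimized array operation

t_loop = timeit(lambda: loop_sum_of_squares(x), number=3)
t_vec = timeit(lambda: vectorized_sum_of_squares(x), number=3)
print(f"loop: {t_loop:.3f}s   vectorized: {t_vec:.3f}s")
```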

Augmented Reality Law, Privacy, and Ethics

Augmented Reality (AR) is the blending of digital information into a real-world environment. A common example can be seen during any televised football game, in which information about the game is digitally overlaid on the field as the players move and position themselves. Another application is Google Glass, which enables users to see AR graphics and information about their location and surroundings on the lenses of their "digital eyewear", changing in real-time as they move about. Augmented Reality Law, Privacy, and Ethics is the first book to examine the social, legal, and ethical issues surrounding AR technology. Digital eyewear products have very recently thrust this rapidly-expanding field into the mainstream, but the technology is so much more than those devices. Industry analysts have dubbed AR the "eighth mass medium" of communications. Science fiction movies have shown us the promise of this technology for decades, and now our capabilities are finally catching up to that vision. Augmented Reality will influence society as fundamentally as the Internet itself has done, and such a powerful medium cannot help but radically affect the laws and norms that govern society. No author is as uniquely qualified to provide a big-picture forecast and guidebook for these developments as Brian Wassom. A practicing attorney, he has been writing on AR law since 2007 and has established himself as the world's foremost thought leader on the intersection of law, ethics, privacy, and AR. Augmented Reality professionals around the world follow his Augmented Legality® blog. This book collects and expands upon the best ideas expressed in that blog, and sets them in the context of a big-picture forecast of how AR is shaping all aspects of society. Augmented reality thought-leader Brian Wassom provides you with insight into how AR is changing our world socially, ethically, and legally. Includes current examples, case studies, and legal cases from the frontiers of AR technology. Learn how AR is changing our world in the areas of civil rights, privacy, litigation, courtroom procedure, addiction, pornography, criminal activity, patent, copyright, and free speech. An invaluable reference guide to the impacts of this cutting-edge technology for anyone who is developing apps for it, using it, or affected by it in daily life.

Cloud Enabling IBM CICS

This IBM® Redbooks® publication takes an existing IBM 3270-COBOL-VSAM application and describes how to use the features of IBM Customer Information Control System (CICS®) Transaction Server (CICS TS) cloud enablement. Working with the General Insurance Application (GENAPP) as an example, this book describes the steps needed to monitor both platform and application health using the CICS Explorer CICS Cloud perspective. It also shows you how to apply threshold policy and measure resource usage, all without source code changes to the original application. In addition, this book describes how to use multi-versioning to safely and reliably apply and back out application changes. This Redbooks publication includes instructions about the following topics:
• How to create a CICS TS platform to manage and reflect the health of a set of CICS TS regions, and the services that they provide to applications
• How to quickly get value from CICS TS applications, by creating and deploying a CICS TS application for an existing user application
• How to protect your CICS TS platform from erroneous applications by using threshold policies
• How to deploy and run multiple versions of the same CICS TS application on the same CICS TS platform at the same time, enabling a safer migration from one application version to another, with no downtime
• How to measure application resource usage, enabling a comparison of the performance of different application versions, and chargeback based on application use

FileMaker® Pro 13 Absolute Beginner’s Guide

Make the most of FileMaker Pro 13 without becoming a technical expert! This book is the fastest way to create FileMaker Pro databases that perform well, are easy to manage, solve problems, and achieve your goals! Even if you’ve never used FileMaker Pro before, you’ll learn how to do what you want, one incredibly clear and easy step at a time. FileMaker Pro has never, ever been this simple! Who knew how simple FileMaker® Pro 13 could be? This is the easiest, most practical beginner’s guide to using the powerful new FileMaker Pro 13 database program…simple, reliable instructions for doing everything you really want to do! Here’s a small sample of what you’ll learn:
• Get comfortable with the FileMaker Pro environment, and discover all you can do with it
• Create complete databases instantly with Starter Solutions
• Design custom databases that efficiently meet your specific needs
• Identify the right tables, fields, and relationships; create new databases from scratch
• Expand your database to integrate new data and tables
• Craft layouts that make your database easier and more efficient to use
• Quickly find, sort, organize, import, and export data
• Create intuitive, visual reports and graphs for better decision-making
• Use scripts to automate a wide variety of routine tasks
• Safeguard databases with accounts, privileges, and reliable backups
• Share data with colleagues running iPads, iPhones, Windows computers, or Macs
• Take your data with you through FileMaker Go
• Master expert tips and hidden features you’d never find on your own
• And much more…

Google Earth Forensics

Google Earth Forensics is the first book to explain how to use Google Earth in digital forensic investigations. This book teaches you how to leverage Google's free tool to craft compelling location-based evidence for use in investigations and in the courtroom. It shows how to extract location-based data and display it as evidence in compelling audiovisual formats that present the data in contextual, meaningful, and easy-to-understand ways. As mobile computing devices become more prevalent and powerful, they are becoming more and more useful in the field of law enforcement investigations and forensics. Of all the widely used mobile applications, none have more potential for helping solve crimes than those with geo-location tools. Written for investigators and forensic practitioners, Google Earth Forensics is written by an investigator and trainer with more than 13 years of experience in law enforcement, who will show you how to use this valuable tool anywhere: at the crime scene, in the lab, or in the courtroom. Learn how to extract location-based evidence using the Google Earth program or app on computers and mobile devices. Covers the basics of GPS systems and the usage of Google Earth, and helps sort through data imported from external evidence sources. Includes tips on presenting evidence in compelling, easy-to-understand formats.

IBM IMS Solutions for Automating Database Management

Over the last few years, IBM® IMS™ and IMS tools have been modernizing their interfaces to bring them more in line with current interface designs. As mainframe software products become more integrated with the Windows and mobile environments, a common approach to interfaces becomes more relevant. The traditional 3270 interface, with ISPF as the main interface, is no longer the only way to perform these processes. There is also a need to provide a more common-looking interface so that the tools do not each have a product-specific interface; this allows more cross-product integration. With Eclipse and web-based interfaces being used in development environments, tooling built on those environments provides productivity improvements because the interfaces are common and familiar. IMS and IMS tools developers are making use of those environments to provide tooling that performs some of the standard DBA functions. This book takes selected processes and shows how this new tooling can be used, providing productivity improvements and a more familiar environment for new generations of DBAs. Some of the functions normally done by DBAs or console operators can now be done in this Eclipse-based environment by application developers, eliminating the need to request these services from others. This IBM Redbooks® publication examines specific IMS DBA processes and highlights the new IMS and IMS tools features that offer an alternative way to accomplish those processes. Each chapter highlights a different area of the DBA process, such as:
• PSB creation
• Starting/stopping a database in an IMS system
• Recovering a database
• Cloning a set of databases

JES3 to JES2 Migration Considerations

This book deals with the migration from JES3 to JES2. Part One describes the decision to migrate; Part Two describes the steps and considerations involved in the migration. This IBM® Redbooks® publication provides information to help clients that have JES3 and would like to migrate to JES2. It provides a comprehensive list of the differences between the two job entry subsystems and provides information to help you determine the migration effort and actions. The book is aimed at operations personnel, system programmers, and application developers.

Sample Size Calculations for Clustered and Longitudinal Outcomes in Clinical Research

This book explains how to determine sample size for studies with correlated outcomes, which are widely implemented in medical, epidemiological, and behavioral studies. For clustered studies, the authors provide sample size formulas that account for variable cluster sizes and within-cluster correlation. For longitudinal studies, they present sample size formulas that account for within-subject correlation among repeated measurements and various missing data patterns. For multiple levels of clustering, the authors describe how randomization impacts trial administration, analysis, and sample size requirement.
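As a small worked illustration of the kind of adjustment the book treats rigorously, the Python sketch below inflates an individually randomized sample size by the standard design effect 1 + (m - 1) * ICC for clusters of size m with intraclass correlation ICC; this is the generic equal-cluster-size formula, not the authors' formulas for variable cluster sizes or longitudinal designs.

```python
import math

def clustered_sample_size(n_individual: int, cluster_size: int, icc: float) -> int:
    """Inflate an individually randomized sample size by the design effect
    DE = 1 + (m - 1) * icc, assuming equal cluster sizes m."""
    design_effect = 1.0 + (cluster_size - 1) * icc
    return math.ceil(n_individual * design_effect)

# Example: 200 subjects needed under individual randomization,
# clusters of 20 subjects, intraclass correlation 0.05
n = clustered_sample_size(200, cluster_size=20, icc=0.05)
print(n)  # 200 * (1 + 19 * 0.05) = 200 * 1.95 = 390
```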