talk-data.com

Topic

data-science

2252

tagged

Activity Trend: 2020-Q1 to 2026-Q1, 1 peak/qtr

Activities

2252 activities · Newest first

Data Clean-Up and Management

Data use in the library has specific characteristics and common problems. Data Clean-up and Management addresses these, and provides methods to clean up frequently-occurring data problems using readily-available applications. The authors highlight the importance and methods of data analysis and presentation, and offer guidelines and recommendations for a data quality policy. The book gives step-by-step how-to directions for common dirty data issues.

Focused towards libraries and practicing librarians
Deals with practical, real-life issues and addresses common problems that all libraries face
Offers cradle-to-grave treatment for preparing and using data, including download, clean-up, management, analysis and presentation

Python for Data Analysis

Python for Data Analysis is concerned with the nuts and bolts of manipulating, processing, cleaning, and crunching data in Python. It is also a practical, modern introduction to scientific computing in Python, tailored for data-intensive applications. This is a book about the parts of the Python language and libraries you’ll need to effectively solve a broad set of data analysis problems. This book is not an exposition on analytical methods using Python as the implementation language. Written by Wes McKinney, the main author of the pandas library, this hands-on book is packed with practical case studies. It’s ideal for analysts new to Python and for Python programmers new to scientific computing.

Use the IPython interactive shell as your primary development environment
Learn basic and advanced NumPy (Numerical Python) features
Get started with data analysis tools in the pandas library
Use high-performance tools to load, clean, transform, merge, and reshape data
Create scatter plots and static or interactive visualizations with matplotlib
Apply the pandas groupby facility to slice, dice, and summarize datasets
Measure data by points in time, whether it’s specific instances, fixed periods, or intervals
Learn how to solve problems in web analytics, social sciences, finance, and economics, through detailed examples

Beginning R: An Introduction to Statistical Programming

Beginning R: An Introduction to Statistical Programming is a hands-on book showing how to use the R language, write and save R scripts, build and import data files, and write your own custom statistical functions. R is a powerful open-source implementation of the statistical language S, which was developed by AT&T. R has eclipsed S and the commercially-available S-Plus language, and has become the de facto standard for doing, teaching, and learning computational statistics. R is both an object-oriented language and a functional language that is easy to learn, easy to use, and completely free. A large community of dedicated R users and programmers provides an excellent source of R code, functions, and data sets. R is also being adopted into commercial tools such as Oracle Database. Your investment in learning R is sure to pay off in the long term as R continues to grow into the go-to language for statistical exploration and research.

Covers the freely-available R language for statistics
Shows the use of R in specific use cases such as simulations, discrete probability solutions, one-way ANOVA analysis, and more
Takes a hands-on and example-based approach incorporating best practices with clear explanations of the statistics being done

What you'll learn

Acquire and install R
Import and export data and scripts
Generate basic statistics and graphics
Program in R to write custom functions
Use R for interactive statistical explorations
Implement simulations and other advanced techniques

Who this book is for

Beginning R: An Introduction to Statistical Programming is an easy-to-read book that serves as an instruction manual and reference for working professionals, professors, and students who want to learn and use R for basic statistics. It is the perfect book for anyone needing a free, capable, and powerful tool for exploring statistics and automating their use.

The Little SAS® Book: A Primer

A classic that just keeps getting better: The Little SAS Book. The fifth edition has been completely updated to reflect the new default output introduced with SAS 9.3. In addition, there is now a full chapter devoted to ODS Graphics including the SGPLOT and SGPANEL procedures. Other changes include expanded coverage of linguistic sorting and a new section on concatenating macro variables with other text. This title belongs on every SAS programmer's bookshelf. It's a resource not just to get you started, but one you'll return to as you continue to improve your programming skills.

R in a Nutshell, 2nd Edition

If you’re considering R for statistical computing and data visualization, this book provides a quick and practical guide to just about everything you can do with the open source R language and software environment. You’ll learn how to write R functions and use R packages to help you prepare, visualize, and analyze data. Author Joseph Adler illustrates each process with a wealth of examples from medicine, business, and sports. Updated for R 2.14 and 2.15, this second edition includes new and expanded chapters on R performance, the ggplot2 data visualization package, and parallel R computing with Hadoop.

Get started quickly with an R tutorial and hundreds of examples
Explore R syntax, objects, and other language details
Find thousands of user-contributed R packages online, including Bioconductor
Learn how to use R to prepare data for analysis
Visualize your data with R’s graphics, lattice, and ggplot2 packages
Use R to calculate statistical tests, fit models, and compute probability distributions
Speed up intensive computations by writing parallel R programs for Hadoop
Get a complete desktop reference to R

Service-Oriented Distributed Knowledge Discovery

A new approach to distributed large-scale data mining, service-oriented knowledge discovery extracts useful knowledge from often unmanageable volumes of data by exploiting data mining and machine learning distributed models and techniques in service-oriented infrastructures. Service-Oriented Distributed Knowledge Discovery presents techniques, algorithms, and systems based on the service-oriented paradigm. It explains how to design services for data analytics, describes real systems for implementing distributed knowledge discovery applications, and explores mobile data mining models.

The Pragmatic MBA for Scientific and Technical Executives

This primer enables professionals with technical expertise to collaborate with their business-side colleagues. Emphasizing brevity and clarity, it gives technical staff answers to their most pressing questions about economics, finance, marketing, strategic decision-making, accounting, management, and related subjects. It does not offer condensed first-year MBA courses; instead, it presents streamlined concepts and insights that are easy enough to be accessible and challenging enough to hold one's interest. Its examples from pharma, IT, aircraft/navigation, and other industries highlight problems that technical professionals face daily. Written by "one of them," its credibility makes it more useful than Internet resources. Because it concentrates on pragmatic (as opposed to academic) approaches to business, it empowers technical staff to stay with the conversation--and take it to a higher level.

Bertrand C. Liang, MD, PhD, MBA, is Managing Director of LCC Ventures and Executive Director of Pfenex, Inc. He is trained in molecular biology and genetics (PhD) and is a clinician (MD) with subspecialty training in neurology and oncology, and serves as a Visiting University Professor at Liaoning He University, Shenyang, China.

Creates frameworks and builds concepts enabling technical staff to work with their business colleagues
Delivers content for pragmatic, immediate use, not condensed presentations of subjects from first-year MBA curriculum
Extends readers' grasp by posting additional resources at a freely-available website

IBM Cognos Dynamic Cubes

IBM® Cognos® Business Intelligence (BI) provides a proven enterprise BI platform with an open data strategy, providing customers with the ability to leverage data from any source, package it into a business model, and make it available to consumers in various interfaces that are tailored to the task. IBM Cognos Dynamic Cubes complements the existing Cognos BI capabilities and continues the tradition of an open data model. It focuses on extending the scalability of the IBM Cognos platform to enable speed-of-thought analytics over terabytes of enterprise data, without having to invest in a new data warehouse appliance. This capability adds a new level of query intelligence so you can unleash the power of your enterprise data warehouse. This IBM Redbooks® publication addresses IBM Cognos Business Intelligence V10.2 and specifically, the IBM Cognos Dynamic Cubes capabilities. This book can help you in the following ways:

Understand core features of the Dynamic Cubes capabilities of IBM Cognos BI V10.2
Learn by example with practical scenarios using the IBM Cognos samples

Statistical Monitoring of Complex Multivariate Processes: With Applications in Industrial Process Control

The development and application of multivariate statistical techniques in process monitoring has gained substantial interest over the past two decades in academia and industry alike. Initially developed for monitoring and fault diagnosis in complex systems, such techniques have been refined and applied in various engineering areas, for example mechanical and manufacturing, chemical, electrical and electronic, and power engineering. The recipe for the tremendous interest in multivariate statistical techniques lies in their simplicity and adaptability for developing monitoring applications. In contrast, competing model-, signal- or knowledge-based techniques showed their potential only whenever cost-benefit economics have justified the required effort in developing applications. Statistical Monitoring of Complex Multivariate Processes presents recent advances in statistics-based process monitoring, explaining how these processes can now be used in areas such as mechanical and manufacturing engineering, in addition to the traditional chemical industry. This book:

Contains a detailed theoretical background of the component technology.
Brings together a large body of work to address the field's drawbacks, and develops methods for their improvement.
Details cross-disciplinary utilization, exemplified by examples in chemical, mechanical and manufacturing engineering.
Presents real life industrial applications, outlining deficiencies in the methodology and how to address them.
Includes numerous examples, tutorial questions and homework assignments in the form of individual and team-based projects, to enhance the learning experience.
Features a supplementary website including Matlab algorithms and data sets.

This book provides a timely reference text to the rapidly evolving area of multivariate statistical analysis for academics, advanced level students, and practitioners alike.

Industrial Statistics with Minitab

Industrial Statistics with MINITAB demonstrates the use of MINITAB as a tool for performing statistical analysis in an industrial context. This book covers introductory industrial statistics, exploring the most commonly used techniques alongside those that serve to give an overview of more complex issues. A plethora of examples in MINITAB are featured along with case studies for each of the statistical techniques presented. Industrial Statistics with MINITAB:

Provides comprehensive coverage of user-friendly practical guidance to the essential statistical methods applied in industry.
Explores statistical techniques and how they can be used effectively with the help of MINITAB 16.
Contains extensive illustrative examples and case studies throughout and assumes no previous statistical knowledge.
Emphasises data graphics and visualization, and the most used industrial statistical tools, such as Statistical Process Control and Design of Experiments.
Is supported by an accompanying website featuring case studies and the corresponding datasets.

Six Sigma Green Belts and Black Belts will find explanations and examples of the most relevant techniques in DMAIC projects. The book can also be used as a quick reference, enabling the reader to be confident enough to explore other MINITAB capabilities.

Cody's Collection of Popular SAS Programming Tasks and How to Tackle Them

Cody's Collection of Popular SAS Programming Tasks and How to Tackle Them presents often-used programming tasks that readers can either use as presented or modify to fit their own programs, all in one handy volume. Esteemed author and SAS expert Ron Cody covers such topics as character to numeric conversion, automatic detection of numeric errors, combining summary data with detail data, restructuring a data set, grouping values using several innovative methods, performing an operation on all character or all numeric variables in a SAS data set, and much more! SAS users of all levels interested in improving their programming skills will benefit from this easy-to-follow collection of tasks.

This book is part of the SAS Press program.

Solving Business Problems with Informix TimeSeries

The world is becoming more and more instrumented, interconnected, and intelligent in what IBM® terms a smarter planet, with more and more data being collected for analysis. In trade magazines, this trend is called big data. As part of this trend, the following types of time-based information are collected:

Large data centers support a corporation or provide cloud services. These data centers need to collect temperature, humidity, and other types of information.
Utility meters (referred to as smart meters) allow utility companies to collect information over a wireless network and to collect more data than ever before.

IBM Informix® TimeSeries is optimized for the processing of time-based data and can provide the following benefits:

Storage savings: Storage can be optimized when you know the characteristics of your time-based data. Informix TimeSeries often uses one third of the storage space that is required by a standard relational database.
Query performance: Informix TimeSeries takes into consideration the type of data to optimize its organization on disk and eliminates the need for some large indexes and additional sorting. For these reasons and more, some queries can easily have an order of magnitude performance improvement compared to a standard relational database.
Simpler queries: Informix TimeSeries includes a large set of specialized functions that allow you to better express the processing that you want to execute. It even provides a toolkit so that you can add proprietary algorithms to the library.

This IBM Redbooks® publication is for people who want to implement a solution that revolves around time-based data. It gives you the information that you need to get started and be productive with Informix TimeSeries.

Regression for Economics

Regression analysis is the most commonly used statistical method in the world. Although few would characterize this technique as simple, regression is in fact both simple and elegant. The complexity that many attribute to regression analysis is often a reflection of their lack of familiarity with the language of mathematics. But regression analysis can be understood even without a mastery of sophisticated mathematical concepts. This book provides the foundation and will help demystify regression analysis using examples from economics and with real data to show the applications of the method. The concepts related to regression analysis are explained in a way that is comprehensible to those whose mathematical skills are not at an expert level, and the book uses Microsoft Excel to obtain regression results. What hinders people's comprehension of regression analysis is the difficulty many have in understanding mathematical symbols and derivations. By removing this obstacle, this book enables the logical reader to learn regression without possessing superior mathematical skills.

SAS Hash Object Programming Made Easy

Hash objects, an efficient look-up tool in the SAS DATA step, are object-oriented programming structures that function differently from traditional SAS language statements. Michele Burlew's SAS Hash Object Programming Made Easy shows readers how to use these powerful features, which they can program to quickly look up and manage data and to conserve computing resources. SAS provides various look-up techniques, and hash objects are among the newest, so many users may not yet have used them. Because the examples presented vary in complexity, SAS Hash Object Programming Made Easy is useful to SAS users of all experience levels, from novice programmer to advanced programmer. Novice programmers can adapt some of the simpler hash programming techniques as they develop their SAS programming skills. This book helps more experienced programmers learn how to take advantage of hash object programming by comparing traditional processing techniques to those that use hash objects. Additionally, users from diverse fields with different requirements can adapt the examples in SAS Hash Object Programming Made Easy to fit their unique situations.

This book is part of the SAS Press program.

Enterprise Analytics: Optimize Performance, Process, and Decisions Through Big Data

The Definitive Guide to Enterprise-Level Analytics Strategy, Technology, Implementation, and Management

Organizations are capturing exponentially larger amounts of data than ever, and now they have to figure out what to do with it. Using analytics, you can harness this data, discover hidden patterns, and use this knowledge to act meaningfully for competitive advantage. Suddenly, you can go beyond understanding “how, when, and where” events have occurred, to understand why – and use this knowledge to reshape the future. Now, analytics pioneer Tom Davenport and the world-renowned experts at the International Institute for Analytics (IIA) have brought together the latest techniques, best practices, and research on analytics in a single primer for maximizing the value of enterprise data. Enterprise Analytics is today’s definitive guide to analytics strategy, planning, organization, implementation, and usage. It covers everything from building better analytics organizations to gathering data; implementing predictive analytics to linking analysis with organizational performance. The authors offer specific insights for optimizing supply chains, online services, marketing, fraud detection, and many other business functions. They support their powerful techniques with many real-world examples, including chapter-length case studies from healthcare, retail, and financial services. Enterprise Analytics will be an invaluable resource for every business and technical professional who wants to make better data-driven decisions: operations, supply chain, and product managers; product, financial, and marketing analysts; CIOs and other IT leaders; data, web, and data warehouse specialists, and many others.

Bayesian Statistics: An Introduction, 4th Edition

Bayesian Statistics is the school of thought that combines prior beliefs with the likelihood of a hypothesis to arrive at posterior beliefs. The first edition of Peter Lee's book appeared in 1989, but the subject has moved ever onwards, with increasing emphasis on Monte Carlo based techniques. This new fourth edition looks at recent techniques such as variational methods, Bayesian importance sampling, approximate Bayesian computation and Reversible Jump Markov Chain Monte Carlo (RJMCMC), providing a concise account of the way in which the Bayesian approach to statistics develops as well as how it contrasts with the conventional approach. The theory is built up step by step, and important notions such as sufficiency are brought out of a discussion of the salient features of specific examples. This edition:

Includes expanded coverage of Gibbs sampling, including more numerical examples and treatments of OpenBUGS, R2WinBUGS and R2OpenBUGS.
Presents significant new material on recent techniques such as Bayesian importance sampling, variational Bayes, Approximate Bayesian Computation (ABC) and Reversible Jump Markov Chain Monte Carlo (RJMCMC).
Provides extensive examples throughout the book to complement the theory presented.
Accompanied by a supporting website featuring new material and solutions.

More and more students are realizing that they need to learn Bayesian statistics to meet their academic and professional goals. This book is best suited for use as a main text in courses on Bayesian statistics for third and fourth year undergraduates and postgraduate students.

Infographics: The Power of Visual Storytelling

Transform your marketing efforts through the power of visual content

In today's fast-paced environment, you must communicate your message in a concise and engaging way that sets it apart from the noise. Visual content, such as infographics and data visualization, can accomplish this. With DIY functionality, Infographics: The Power of Visual Storytelling will teach you how to find stories in your data, and how to visually communicate and share them with your audience for maximum impact. Infographics will show you the vast potential of using the communication medium as a marketing tool by creating informative and shareable infographic content.

Learn how to explain an object, idea, or process using strong illustration that captures interest and provides instant clarity
Discover how to unlock interesting stories (in previously buried or boring data) and turn them into visual communications that will help build brands and increase sales
Use the power of visual content to communicate with and engage your audience, capture attention, and expand your market.

How Data Science Is Transforming Health Care

In the early days of the 20th century, department store magnate John Wanamaker famously said, "I know that half of my advertising doesn't work. The problem is that I don't know which half." That remained basically true until Google transformed advertising with AdSense based on new uses of data and analysis. The same might be said about health care, and it's poised to go through a similar transformation as new tools, techniques, and data sources come on line. Soon we'll make policy and resource decisions based on much better understanding of what leads to the best outcomes, and we'll make medical decisions based on a patient's specific biology. The result will be better health at less cost. This paper explores how data analysis will help us structure the business of health care more effectively around outcomes, and how it will transform the practice of medicine by personalizing for each specific patient.

SAP BusinessObjects BI 4.0 The Complete Reference 3/E

The definitive reference for building actionable business intelligence, completely revised for SAP BusinessObjects BI 4.0.

Unleash the full potential of business intelligence with fact-based decisions, aligned to business goals, using reports and dashboards that lead from insight to action. SAP BusinessObjects BI 4.0: The Complete Reference offers completely updated coverage of the latest BI platform. Find out how to work with the new Information Design Tool to create universes that access multiple data sources and SAP BW. See how to translate complex business questions into highly efficient Web Intelligence queries and publish your results to the BI Launchpad. Learn how to create dashboards from data sourced through a universe or spreadsheet. The most important concepts for universe designers, report and dashboard authors, and business analysts are fully explained and illustrated by screenshots, diagrams, and step-by-step instructions.

Establish and evolve BI goals
Maximize your BI investments by offering the right module to the right user
Create robust universes with the Information Design Tool, leveraging multiple data sources, derived tables, aggregate awareness, and parameters
Develop a security plan that is scalable and flexible
Design Web Intelligence reports from basic to advanced
Create sophisticated calculations and advanced formatting to highlight critical business trends
Build powerful dashboards to embed in PowerPoint or the BI Launchpad
Use Explorer to visually navigate large data sets and uncover patterns

The Functional Art: An introduction to information graphics and visualization

Unlike any time before in our lives, we have access to vast amounts of free information. With the right tools, we can start to make sense of all this data to see patterns and trends that would otherwise be invisible to us. By transforming numbers into graphical shapes, we allow readers to understand the stories those numbers hide. In this practical introduction to understanding and using information graphics, you’ll learn how to use data visualizations as tools to see beyond lists of numbers and variables and achieve new insights into the complex world around us. Regardless of the kind of data you’re working with–business, science, politics, sports, or even your own personal finances–this book will show you how to use statistical charts, maps, and explanation diagrams to spot the stories in the data and learn new things from it.

Condé Nast Traveler’s John Grimwade, National Geographic Magazine’s Fernando Baptista, The New York Times’ Steve Duenes, The Washington Post’s Hannah Fairfield, Hans Rosling of the Gapminder Foundation, Stanford’s Geoff McGhee, and European superstars Moritz Stefaner, Jan Willem Tulp, Stefanie Posavec, and Gregor Aisch.

The Functional Art reveals:

In this introductory course on information graphics, Alberto Cairo goes into greater detail with even more visual examples of how to create effective information graphics that function as practical tools for aiding perception. You’ll learn how to: incorporate basic design principles in your visualizations, create simple interfaces for interactive graphics, and choose the appropriate type of graphic forms for your data. Cairo also deconstructs successful information graphics from The New York Times and National Geographic magazine with sketches and images not shown in the book.