talk-data.com

Topic

data-science

2252

tagged

Activities

2252 activities · Newest first

Private and Open Data in Asia: A Regional Guide

The rise of big data in recent years coincides with the economic and political rise of Asia, especially among the five countries that make up the bulk of the East Asian Internet-using population: China, Japan, Korea, India, and Indonesia. If you’re thinking of entering the Asian market, this O’Reilly report provides an overview of the current state of big data and open data in these countries, and helps you examine whether the benefits of doing business with them outweigh the costs. While Japan and South Korea are highly developed countries with lofty Internet penetration rates, China, India, and Indonesia have enormous populations, relatively low Internet penetration, and enormous growth potential. But access to open data from fields such as healthcare, education, agriculture, transportation, energy, and finance—data vital for building businesses and services—varies from country to country. Each of them has a distinctive character reflecting its national priorities. To help you assess risk vs opportunity in the Asian market, author Franklin Lu reviews these five countries individually to reveal the nature of data privacy laws, open data initiatives, and existing businesses.

Sharpening Your Advanced SAS Skills

This guide presents sophisticated SAS programming techniques, procedures, and tools, such as Proc SQL, hash tables, and SAS Macro programming, for any industry. It empowers both advanced programmers who need a quick refresher and programmers interested in learning new techniques. It shows how to take advantage of the latest SAS options and new SAS procedures. The book illustrates syntax with simple, common task-oriented examples and prepares readers for the advanced SAS certification exam. Mindmaps and process flowcharts are available on the author's website.

The Definitive Guide to DAX: Business intelligence with Microsoft Excel, SQL Server Analysis Services, and Power BI

This comprehensive and authoritative guide will teach you the DAX language for business intelligence, data modeling, and analytics. Leading Microsoft BI consultants Marco Russo and Alberto Ferrari help you master everything from table functions through advanced code and model optimization. You’ll learn exactly what happens under the hood when you run a DAX expression, how DAX behaves differently from other languages, and how to use this knowledge to write fast, robust code. If you want to leverage all of DAX’s remarkable power and flexibility, this no-compromise “deep dive” is exactly what you need.
- Perform powerful data analysis with DAX for Microsoft SQL Server Analysis Services, Excel, and Power BI
- Master core DAX concepts, including calculated columns, measures, and error handling
- Understand evaluation contexts and the CALCULATE and CALCULATETABLE functions
- Perform time-based calculations: YTD, MTD, previous year, working days, and more
- Work with expanded tables, complex functions, and elaborate DAX expressions
- Perform calculations over hierarchies, including parent/child hierarchies
- Use DAX to express diverse and unusual relationships
- Measure DAX query performance with SQL Server Profiler and DAX Studio

Dashboards for Excel

The book takes a hands-on approach to developing dashboards, from instructing users on advanced Excel techniques to addressing dashboard pitfalls common in the real world. Dashboards for Excel is your key to creating informative, actionable, and interactive dashboards and decision support systems. Throughout the book, the reader is challenged to think about Excel and data analytics differently—that is, to think outside the cell. This book shows you how to create dashboards in Excel quickly and effectively. In this book, you learn how to:
- Apply data visualization principles for more effective dashboards
- Employ dynamic charts and tables to create dashboards that are constantly up-to-date and providing fresh information
- Use understated yet powerful formulas for Excel development
- Apply advanced Excel techniques mixing formulas and Visual Basic for Applications (VBA) to create interactive dashboards
- Create dynamic systems for decision support in your organization
- Avoid common problems in Excel development and dashboard creation
- Get started with the Excel data model, PowerPivot, and Power Query

Mastering Data Analysis with R

Unlock the full potential of the R programming language with 'Mastering Data Analysis with R'. This book takes you from basic data manipulation to advanced visualization and modeling techniques, providing hands-on guidance to solve real-world data science challenges.
What this book will help me do:
- Efficiently manipulate and clean large datasets using R techniques.
- Build and evaluate statistical models and machine learning algorithms.
- Visualize data insights through compelling graphics and visualizations.
- Analyze social networks and graph data within R's environment.
- Perform geospatial data analysis with specialized R packages.
Author(s): Daróczi is a seasoned data scientist and R developer with extensive industry and academic experience. He specializes in employing R for sophisticated data analysis tasks and visualization. His approachable writing style, combined with in-depth technical expertise, ensures learners of varying levels can connect with and benefit from his materials.
Who is it for? This book is ideal for data scientists, statisticians, and analysts who are familiar with the basics of R and want to deepen their expertise. If you are looking to learn practical applications of advanced R capabilities for data wrangling, modeling, and visualization, this is for you. It suits professionals aiming to implement data-driven solutions and empowers them to make informed decisions with R's tools. Find practical techniques to elevate your data analysis proficiency here.

Building a Recommendation System with R

Dive into building recommendation systems with R in this comprehensive guide. You will learn about data mining, machine learning, and how R's powerful libraries and tools can be utilized to create efficient and optimized recommendation engines. By the end of this book, you will have the expertise to develop custom solutions tailored to specific data and use cases.
What this book will help me do:
- Master the foundations of recommendation systems and their applications.
- Understand and implement essential data preprocessing techniques.
- Learn to optimize recommendation algorithms for better efficiency.
- Explore the use of the recommenderlab package in R for building models.
- Gain hands-on experience through a complete case study building a recommendation engine.
Author(s): Usuelli is a seasoned data scientist and R programming enthusiast passionate about machine learning and data analysis. They have extensive experience in developing recommendation systems for various industries, leveraging the power of R for robust solutions. Usuelli's clear teaching approach makes complex concepts accessible to learners of all levels.
Who is it for? This book is ideal for developers who already possess a fundamental understanding of R and basic machine learning principles. If you aim to deepen your knowledge in creating advanced recommendation systems and practically apply these concepts, this book is the perfect resource for you. It is an excellent guide for professionals looking to specialize in predictive analytics and systems design.

Data Analysis in the Cloud

Data Analysis in the Cloud introduces and discusses models, methods, techniques, and systems to analyze the large number of digital data sources available on the Internet using the computing and storage facilities of the cloud. Coverage includes scalable data mining and knowledge discovery techniques together with cloud computing concepts, models, and systems. Specific sections focus on map-reduce and NoSQL models. The book also includes techniques for conducting high-performance distributed analysis of large data on clouds. Finally, the book examines research trends such as Big Data pervasive computing, data-intensive exascale computing, and massive social network analysis.
- Introduces data analysis techniques and cloud computing concepts
- Describes cloud-based models and systems for Big Data analytics
- Provides examples of the state-of-the-art in cloud data analysis
- Explains how to develop large-scale data mining applications on clouds
- Outlines the main research trends in the area of scalable Big Data analysis

2015 Data Science Salary Survey

For the third consecutive year, O’Reilly Media conducted an anonymous survey to expose the tools that successful data scientists and engineers use, and how those tool choices might relate to their salary. For the 2015 version of the Data Science Salary Survey, we heard from over 600 respondents who work in and around the data space for a variety of industries across 47 countries and 38 U.S. states. The research was based on data collected through an online 32-question survey, including demographic information, time spent on various data-related tasks, and the use or non-use of 116 software tools. Findings include:
- Average number of tools and median income for all respondents
- Distribution of responses by a variety of factors, including age, gender, location, industry, role, and cloud computing
- Detailed analysis of tool use, including tool clusters
- Correlation of tool usage and salary
Download this free in-depth report to gain insight from these potentially career-changing findings, and plug your own variables into one of the linear models to predict your own salary. The survey is now open for the 2016 report, and it takes just 5 to 10 minutes to complete: http://www.oreilly.com/go/ds-salary-survey-2016.

Inferential Models

This book introduces the authors' recently developed approach to inference: the inferential model (IM) framework. This logical framework for exact probabilistic inference does not require the user to input prior information. The book covers the foundational motivations for this new approach, the basic theory behind its calibration properties, many important applications, and new directions for research. It explores a new way of thinking compared to existing schools of thought on statistical inference and encourages readers to think carefully about the correct approach to scientific inference.

Beginning Big Data with Power BI and Excel 2013

In Beginning Big Data with Power BI and Excel 2013, you will learn to solve business problems by tapping the power of Microsoft’s Excel and Power BI to import data from NoSQL and SQL databases and other sources, create relational data models, and analyze business problems through sophisticated dashboards and data-driven maps. While Beginning Big Data with Power BI and Excel 2013 covers prominent tools such as Hadoop and the NoSQL databases, it recognizes that most small and medium-sized businesses don’t have the Big Data processing needs of a Netflix, Target, or Facebook. Instead, it shows how to import data and use the self-service analytics available in Excel with Power BI. As you’ll see through the book’s numerous case examples, these tools—which you already know how to use—can perform many of the same functions as the higher-end Apache tools many people believe are required to carry out Big Data projects. Through instruction, insight, advice, and case studies, Beginning Big Data with Power BI and Excel 2013 will show you how to:
- Import and mash up data from web pages, SQL and NoSQL databases, the Azure Marketplace and other sources.
- Tap into the analytical power of PivotTables and PivotCharts and develop relational data models to track trends and make predictions based on a wide range of data.
- Understand basic statistics and use Excel with Power BI to do sophisticated statistical analysis—including identifying trends and correlations.
- Use SQL within Excel to do sophisticated queries across multiple tables, including NoSQL databases.
- Create complex formulas to solve real-world business problems using Data Analysis Expressions (DAX).

Advanced R

An Essential Reference for Intermediate and Advanced R Programmers
Advanced R presents useful tools and techniques for attacking many types of R programming problems, helping you avoid mistakes and dead ends. With more than ten years of experience programming in R, the author illustrates the elegance, beauty, and flexibility at the heart of R. The book develops the necessary skills to produce quality code that can be used in a variety of circumstances. You will learn:
- The fundamentals of R, including standard data types and functions
- Functional programming as a useful framework for solving wide classes of problems
- The positives and negatives of metaprogramming
- How to write fast, memory-efficient code
This book not only helps current R users become R programmers but also shows existing programmers what’s special about R. Intermediate R programmers can dive deeper into R and learn new strategies for solving diverse problems while programmers from other languages can learn the details of R and understand why R works the way it does.

Data Analytics in Sports

As any child with a baseball card intuitively knows, sports and statistics go hand-in-hand. Yet, the general media disdain the flood of sports statistics available today: sports are pure and analytic tools are not. Well, if the so-called purists find tools like baseball’s sabermetrics upsetting, then they’d better brace themselves for the new wave of data analytics. In this O’Reilly report, Janine Barlow examines how advanced predictive analytics are impacting the world of sports—from the rise of tools such as Major League Baseball’s Statcast, which collects data on the movement of balls and players, to SportVU, which the National Basketball Association uses to collect spatial analysis data. You’ll also learn:
- How "Dance Card" makes accurate predictions about NCAA’s "March Madness" tournament
- Why data is crumbling long-standing myths about performance in soccer
- How the National Football League is using wearable devices to collect vital health data about its players
It’s a new world in sports, where data analytics and related information technologies are changing the experience for teams, players, fans, and investors.

Introduction to Probability

Developed from celebrated Harvard statistics lectures, Introduction to Probability provides essential language and tools for understanding statistics, randomness, and uncertainty. The book explores a wide variety of applications and examples, ranging from coincidences and paradoxes to Google PageRank and Markov chain Monte Carlo (MCMC). Additional application areas explored include genetics, medicine, computer science, and information theory. The print book version includes a code that provides free access to an eBook version. The authors present the material in an accessible style and motivate concepts using real-world examples. Throughout, they use stories to uncover connections between the fundamental distributions in statistics and conditioning to reduce complicated problems to manageable pieces. The book includes many intuitive explanations, diagrams, and practice problems. Each chapter ends with a section showing how to perform relevant simulations and calculations in R, a free statistical software environment.

Search-Driven Business Analytics

Compared to the speed and convenience of major web search engines, most business intelligence (BI) products are slow, stiff, and unresponsive. Business leaders today often wait days or weeks to get BI reports on inquiries about customers, products, or markets. But the latest BI products show that a significant change is taking place—a change led by search. This O’Reilly report examines three recent products with intelligent search capabilities: the ThoughtSpot Analytical Search Appliance, Microsoft’s Power BI service, and an offering from Adatao. You’ll learn how these products can provide you with answers and visualizations as quickly as questions come to mind. You’ll investigate:
- The convergence of BI and search
- What a search-driven user experience looks like
- The intelligence required for analytical search
- Data sources and their associated data modeling requirements
- Turning on-the-fly calculations into visualizations
- Applying enterprise scale and security to search

Methods and Applications of Longitudinal Data Analysis

Methods and Applications of Longitudinal Data Analysis describes methods for the analysis of longitudinal data in the medical, biological and behavioral sciences. It introduces basic concepts and functions including a variety of regression models, and their practical applications across many areas of research. Statistical procedures featured within the text include:
- descriptive methods for delineating trends over time
- linear mixed regression models with both fixed and random effects
- covariance pattern models on correlated errors
- generalized estimating equations
- nonlinear regression models for categorical repeated measurements
- techniques for analyzing longitudinal data with non-ignorable missing observations
Emphasis is given to applications of these methods, using substantial empirical illustrations, designed to help users of statistics better analyze and understand longitudinal data. Methods and Applications of Longitudinal Data Analysis equips both graduate students and professionals to confidently apply longitudinal data analysis to their particular discipline. It also provides a valuable reference source for applied statisticians, demographers and other quantitative methodologists.
- From novice to professional: this book starts with the introduction of basic models and ends with the description of some of the most advanced models in longitudinal data analysis
- Enables students to select the correct statistical methods to apply to their longitudinal data and avoid the pitfalls associated with incorrect selection
- Identifies the limitations of classical repeated measures models and describes newly developed techniques, along with real-world examples.

An Introduction to Probability and Statistics, 3rd Edition

A well-balanced introduction to probability theory and mathematical statistics
Featuring updated material, An Introduction to Probability and Statistics, Third Edition remains a solid overview of probability theory and mathematical statistics. Divided into three parts, the Third Edition begins by presenting the fundamentals and foundations of probability. The second part addresses statistical inference, and the remaining chapters focus on special topics. An Introduction to Probability and Statistics, Third Edition includes:
- A new section on regression analysis to include multiple regression, logistic regression, and Poisson regression
- A reorganized chapter on large sample theory to emphasize the growing role of asymptotic statistics
- Additional topical coverage on bootstrapping, estimation procedures, and resampling
- Discussions on invariance, ancillary statistics, conjugate prior distributions, and invariant confidence intervals
- Over 550 problems and answers to most problems, as well as 350 worked out examples and 200 remarks
- Numerous figures to further illustrate examples and proofs throughout
An Introduction to Probability and Statistics, Third Edition is an ideal reference and resource for scientists and engineers in the fields of statistics, mathematics, physics, industrial management, and engineering. The book is also an excellent text for upper-undergraduate and graduate-level students majoring in probability and statistics.

Fundamentals of Statistical Experimental Design and Analysis

Professionals in all areas - business; government; the physical, life, and social sciences; engineering; medicine, etc. - benefit from using statistical experimental design to better understand their worlds and then use that understanding to improve the products, processes, and programs they are responsible for. This book aims to provide the practitioners of tomorrow with a memorable, easy to read, engaging guide to statistics and experimental design. This book uses examples, drawn from a variety of established texts, and embeds them in a business or scientific context, seasoned with a dash of humor, to emphasize the issues and ideas that led to the experiment and the what-do-we-do-next? steps after the experiment. Graphical data displays are emphasized as means of discovery and communication and formulas are minimized, with a focus on interpreting the results that software produces. The role of subject-matter knowledge, and passion, is also illustrated. The examples do not require specialized knowledge, and the lessons they contain are transferrable to other contexts. Fundamentals of Statistical Experimental Design and Analysis introduces the basic elements of an experimental design, and the basic concepts underlying statistical analyses. Subsequent chapters address the following families of experimental designs:
- Completely Randomized designs, with single or multiple treatment factors, quantitative or qualitative
- Randomized Block designs
- Latin Square designs
- Split-Unit designs
- Repeated Measures designs
- Robust designs
- Optimal designs
Written in an accessible, student-friendly style, this book is suitable for a general audience and particularly for those professionals seeking to improve and apply their understanding of experimental design.

Statistics for Big Data For Dummies

The fast and easy way to make sense of statistics for big data
Does the subject of data analysis make you dizzy? You've come to the right place! Statistics For Big Data For Dummies breaks this often-overwhelming subject down into easily digestible parts, offering new and aspiring data analysts the foundation they need to be successful in the field. Inside, you'll find an easy-to-follow introduction to exploratory data analysis, the lowdown on collecting, cleaning, and organizing data, everything you need to know about interpreting data using common software and programming languages, plain-English explanations of how to make sense of data in the real world, and much more. Data has never been easier to come by, and the tools students and professionals need to enter the world of big data are based on applied statistics. While the word "statistics" alone can evoke feelings of anxiety in even the most confident student or professional, it doesn't have to. Written in the familiar and friendly tone that has defined the For Dummies brand for more than twenty years, Statistics For Big Data For Dummies takes the intimidation out of the subject, offering clear explanations and tons of step-by-step instruction to help you make sense of data mining—without losing your cool.
- Helps you to identify valid, useful, and understandable patterns in data
- Provides guidance on extracting previously unknown information from large databases
- Shows you how to discover patterns available in big data
- Gives you access to the latest tools and techniques for working in big data
If you're a student enrolled in a related Applied Statistics course or a professional looking to expand your skillset, Statistics For Big Data For Dummies gives you access to everything you need to succeed.

The Art and Science of Analyzing Software Data

The Art and Science of Analyzing Software Data provides valuable information on analysis techniques often used to derive insight from software data. This book shares best practices in the field generated by leading data scientists, collected from their experience training software engineering students and practitioners to master data science. The book covers topics such as the analysis of security data, code reviews, app stores, log files, and user telemetry, among others. It covers a wide variety of techniques such as co-change analysis, text analysis, topic analysis, and concept analysis, as well as advanced topics such as release planning and generation of source code comments. It includes stories from the trenches from expert data scientists illustrating how to apply data analysis in industry and open source, present results to stakeholders, and drive decisions.
- Presents best practices, hints, and tips to analyze data and apply tools in data science projects
- Presents research methods and case studies that have emerged over the past few years to further understanding of software data
- Shares stories from the trenches of successful data science initiatives in industry

Medical Information Systems Ethics

The exponential digitization of medical data has transformed the practice of medicine. This change raises new and complex issues surrounding health IT. The proper use of communication tools such as telemedicine, e-health, m-health, and big medical data should improve the quality of patient monitoring and care, giving the information system a "human face". Faced with these challenges, the author analyzes from an ethical angle the patient-physician relationship and the sharing, transmission, and storage of medical information, setting out markers for an ethics of the digitization of medical information. Drawing on good-practice recommendations closely associated with values, this model develops tools for reflection and presents the keys to understanding the decision-making issues that reflect both the technological constraints and the complex nature of human reality in medicine.