talk-data.com

Event

O'Reilly Data Science Books

2013-08-09 – 2026-02-25 O'Reilly

Activities tracked

2118

Collection of O'Reilly books on Data Science.

Sessions & talks

The Book of R

The Book of R is a comprehensive, beginner-friendly guide to R, the world's most popular programming language for statistical analysis. Even if you have no programming experience and little more than a grounding in the basics of mathematics, you'll find everything you need to begin using R effectively for statistical analysis. You'll start with the basics, like how to handle data and write simple programs, before moving on to more advanced topics, like producing statistical summaries of your data and performing statistical tests and modeling. You'll even learn how to create impressive data visualizations with R's basic graphics tools and contributed packages, like ggplot2 and ggvis, as well as interactive 3D visualizations using the rgl package. Dozens of hands-on exercises (with downloadable solutions) take you from theory to practice, as you learn:
- The fundamentals of programming in R, including how to write data frames, create functions, and use variables, statements, and loops
- Statistical concepts like exploratory data analysis, probabilities, hypothesis tests, and regression modeling, and how to execute them in R
- How to access R's thousands of functions, libraries, and data sets
- How to draw valid and useful conclusions from your data
- How to create publication-quality graphics of your results

Combining detailed explanations with real-world examples and exercises, this book will provide you with a solid understanding of both statistics and the depth of R's functionality. Make The Book of R your doorway into the growing world of data analysis.

Excel Sales Forecasting For Dummies, 2nd Edition

- Choose, manage, and present data
- Select the right forecasting method for your business
- Use moving averages and predict seasonal sales
- Create sales forecasts you can trust

You don't need magic, luck, or an advanced math degree to develop reliable sales forecasts; you just need Excel and this book! This guide explains how forecasting works and how to use the tools built into Excel. You'll learn how to choose your data, set up tables, and chart your baseline to create both basic and advanced forecasts you can really use.

Inside...
- Prevent common issues
- Why baselines matter
- How to organize your data
- Tips on setting up tables
- Working with pivot charts
- How to forecast seasonal sales revenue
- Forecasting with regression

Quantifying the User Experience, 2nd Edition

Quantifying the User Experience: Practical Statistics for User Research, Second Edition, provides practitioners and researchers with the information they need to confidently quantify, qualify, and justify their data. The book presents a practical guide on how to use statistics to solve common quantitative problems that arise in user research. It addresses questions users face every day, including: Is the current product more usable than our competition? Can we be sure at least 70% of users can complete the task on their first attempt? How long will it take users to purchase products on the website? This book provides a foundation for statistical theories and the best practices needed to apply them. The authors draw on decades of statistical literature from human factors, industrial engineering, and psychology, as well as their own published research, providing concrete solutions (Excel formulas and links to their own web-calculators) along with an engaging discussion on the statistical reasons why tests work and how to effectively communicate results. In this new edition, users will find updates on standardized usability questionnaires and a new chapter on general linear modeling (correlation, regression, and analysis of variance), with updated examples and case studies throughout.
- Completely updated to provide practical guidance on solving usability testing problems with statistics for any project, including those using Six Sigma practices
- Includes new and revised information on standardized usability questionnaires
- Includes a completely new chapter introducing correlation, regression, and analysis of variance
- Shows practitioners which test to use, why they work, and best practices for application, along with easy-to-use Excel formulas and web-calculators for analyzing data
- Recommends ways for researchers and practitioners to communicate results to stakeholders in plain English

Statistics, 3E

Statistics is a class that is required in many college majors, and it's an increasingly popular Advanced Placement high school course. In addition to math and technical students, many business and liberal arts students are required to take it as a fundamental component of their majors. A knowledge of statistical interpretation is vital for many careers. Idiot's Guides: Statistics explains the fundamental tenets in language anyone can understand. Content includes:
- Calculating descriptive statistics
- Measures of central tendency: mean, median, and mode
- Probability
- Variance analysis
- Inferential statistics
- Hypothesis testing
- Organizing data into statistical charts and tables

Introducing Microsoft Power BI

Get started quickly with Microsoft Power BI! Experts Alberto Ferrari and Marco Russo will help you bring your data to life, transforming your company’s data into rich visuals for you to collect and organize, allowing you to focus on what matters most to you. Stay in the know, spot trends as they happen, and push your business to new limits. This free ebook introduces Microsoft Power BI basics through a practical, scenario-based guided tour of the tool, showing you how to build analytical solutions using Power BI. Read the ebook to get an overview of Power BI, or dig deeper and follow along on your PC using the book’s examples. Introducing Microsoft Power BI enables you to evaluate when and how to use Power BI. Get inspired to improve business processes in your company by leveraging the available analytical and collaborative features of this environment. Be sure to watch for the publication of Alberto Ferrari and Marco Russo’s upcoming retail book, Analyzing Data with Power BI and Power Pivot for Excel (ISBN 9781509302765). Go to the book’s page at the Microsoft Press Store for more details: http://aka.ms/analyzingdata/details. Learn more about Power BI at https://powerbi.microsoft.com/.

Practical D3.js

Your indispensable guide to mastering the efficient use of D3.js in professional-standard data visualization projects. You will learn what data visualization is, how to work with it, and how to think like a D3.js expert, both practically and theoretically. Practical D3.js does not just show you how to use D3.js; it teaches you how to think like a data scientist and work with the data in the real world. In Part One, you will learn about theories behind data visualization. In Part Two, you will learn how to use D3.js to create the best charts and layouts. Uniquely, this book intertwines the technical details of D3.js with practical topics such as data journalism and the use of open government data. Written by leading data scientists Tarek Amr and Rayna Stamboliyska, this book is your guide to using D3.js in the real world. Add it to your library today.

You Will Learn:
- How to think like a data scientist and present data in the best way
- What structure and design strategies you can use for compelling data visualization
- How to use data binding, animations and events, scales, and color pickers
- How to use shapes, path generators, arcs and polygons

Who This Book is For: This book is for anyone who wants to learn to master the use of D3.js in a practical manner, while still learning the important theoretical aspects needed to enable them to work with their data in the best possible way.

Probability and Statistics with Reliability, Queuing, and Computer Science Applications, 2nd Edition

An accessible introduction to probability, stochastic processes, and statistics for computer science and engineering applications. This updated and revised edition of the popular classic relates fundamental concepts in probability and statistics to the computer sciences and engineering. The author uses Markov chains and other statistical tools to illustrate processes in reliability of computer systems and networks, fault tolerance, and performance. This edition features an entirely new section on stochastic Petri nets, as well as new sections on system availability modeling, wireless system modeling, numerical solution techniques for Markov chains, and software reliability modeling, among other subjects. Extensive revisions take new developments in solution techniques and applications into account and bring this work totally up to date. It includes more than 200 worked examples and self-study exercises for each section. Probability and Statistics with Reliability, Queuing and Computer Science Applications, Second Edition offers a comprehensive introduction to probability, stochastic processes, and statistics for students of computer science, electrical and computer engineering, and applied mathematics. Its wealth of practical examples and up-to-date information makes it an excellent resource for practitioners as well. An Instructor's Manual presenting detailed solutions to all the problems in the book is available from the Wiley editorial department.

Simulation for Data Science with R

"Simulation for Data Science with R" introduces data professionals to fundamental and advanced simulation techniques using R. You'll understand essential statistical modeling concepts and learn to apply simulation methods to tackle data challenges and enhance your decision-making skills. What this Book will help me do Master five popular simulation methodologies including Monte Carlo and Agent-Based Modeling. Learn to simulate real-world data to uncover patterns and enhance predictions. Enhance your R programming expertise by exploring its advanced statistical features. Gain hands-on experience solving statistical problems through practical examples. Develop comprehensive statistical models aimed at real-world decision support. Author(s) Matthias Templ is a seasoned data science expert with extensive experience in statistical modeling and simulations using R. His work is rooted in real-world problem solving, outlining frameworks that are practical and research-driven. With a dedication to education, Matthias conveys his knowledge in an accessible and supportive manner. Who is it for? If you're experienced in computational methods and wish to refine your understanding of R for advanced statistical simulations, this book is for you. It's ideal for analysts or scientists aiming to enhance their decision-making with simulated data models. Prior experience with R is recommended to fully engage with the rigorous concepts presented.

Data Mining Models

Data mining has become the fastest growing topic of interest in business programs in the past decade. This book is intended to describe the benefits of data mining in business, the process and typical business applications, the workings of basic data mining models, and demonstrate each with widely available free software. The book focuses on demonstrating common business data mining applications. It provides exposure to the data mining process, to include problem identification, data management, and available modeling tools. The book takes the approach of demonstrating typical business data sets with open source software. KNIME is a very easy-to-use tool, and is used as the primary means of demonstration. R is much more powerful and is a commercially viable data mining tool. We also demonstrate WEKA, which is a highly useful academic software, although it is difficult to manipulate test sets and new cases, making it problematic for commercial use.

Mastering Python Data Analysis

Mastering Python Data Analysis provides a comprehensive roadmap for Python developers to enhance their data analysis skills to tackle real-world problems. This book delves into advanced statistical analysis, covering tools, models, and methods to transform raw data into valuable insights.

What this Book will help me do:
- Effectively handle and preprocess data using Python and Pandas.
- Explore statistical models to identify patterns and gain insights from data.
- Learn clustering approaches to detect data groupings and predict outcomes.
- Utilize Bayesian methods for quantifying causal relationships.
- Generate professional reports and visualizations with Python tools like Jupyter Notebook.

Author(s): Vilhelm Persson is a seasoned software developer and data analyst with expertise in leveraging Python for sophisticated data analysis and machine learning tasks. Drawing from years of experience in the tech industry, he provides practical, real-world insights throughout the book. His approachable writing style ensures technical concepts are conveyed with clarity, making data analysis accessible to developers at varying skill levels.

Who is it for? This book is ideal for intermediate Python developers seeking to elevate their data analysis skills. If you are familiar with Python libraries and have an interest in solving complex data problems, this guide will serve as a stepping stone to mastery. Advanced beginners with a curiosity for statistical methods and a desire to learn through practical examples will find this book invaluable. It is also perfect for professionals aiming to integrate Python-based statistical techniques into their workflow.

Theory and Methods of Statistics

Theory and Methods of Statistics covers essential topics for advanced graduate students and professional research statisticians. This comprehensive resource covers many important areas in one manageable volume, including core subjects such as probability theory, mathematical statistics, and linear models, and various special topics, including nonparametrics, curve estimation, multivariate analysis, time series, and resampling. The book presents subjects such as "maximum likelihood and sufficiency," and is written with an intuitive, heuristic approach to build reader comprehension. It also includes many probability inequalities that are not only useful in the context of this text, but also as a resource for investigating convergence of statistical procedures.
- Codifies foundational information in many core areas of statistics into a comprehensive and definitive resource
- Serves as an excellent text for select master’s and PhD programs, as well as a professional reference
- Integrates numerous examples to illustrate advanced concepts
- Includes many probability inequalities useful for investigating convergence of statistical procedures

Applied Regression and Modeling

The book is divided into three parts: (1) prerequisites to regression analysis followed by a discussion on simple regression, (2) multiple regression analysis with applications, and (3) regression and modeling including second order models, nonlinear regression, and interaction models in regressions. All these sections provide examples with complete computer analysis and instructions commonly used in modeling and analyzing these problems. The book deals with detailed analysis and interpretation of computer results. This will help readers to appreciate the power of the computer in applying regression models. Readers will find that understanding computer results is critical to implementing regression and modeling in real-world situations. The book is written for juniors, seniors and graduate students in business, MBAs, professional MBAs, and working people in business and industry. Managers, practitioners, professionals, quality professionals, quality engineers, and anyone involved in data analysis, business analytics, and quality and six sigma will find the book to be a valuable resource.

Advancing Procurement Analytics

One area where data analytics can have a profound effect is your company’s procurement process. Some organizations spend more than two thirds of their revenue buying goods and services, making procurement, out of all business activities, a key element in achieving cost reduction. This report examines how your company can significantly improve procurement analytics to solve business questions quickly and effectively. Author Federico Castanedo, Chief Data Scientist at WiseAthena.com, explains how a probabilistic, bottom-up approach can significantly increase the quality, speed, and scalability of your data preparation operations, whether you’re integrating datasets or cleaning and classifying them. You’ll learn how new solutions leverage automation and machine learning, including the Tamr platform, and help you take advantage of several data-driven actions for procurement, including compliance, price arbitrage, and spend recovery.

The Data Industry

Provides an introduction of the data industry to the field of economics. This book bridges the gap between economics and data science to help data scientists understand the economics of big data, and enable economists to analyze the data industry. It begins by explaining data resources and introduces the data asset. This book defines a data industry chain, enumerates data enterprises’ business models versus operating models, and proposes a mode of industrial development for the data industry. The author describes five types of enterprise agglomerations, and multiple industrial cluster effects. A discussion on the establishment and development of data industry related laws and regulations is provided. In addition, this book discusses several scenarios on how to convert data driving forces into productivity that can then serve society. This book is designed to serve as a reference and training guide for data scientists, data-oriented managers and executives, entrepreneurs, scholars, and government employees.
- Defines and develops the concept of a “Data Industry,” and explains the economics of data to data scientists and statisticians
- Includes numerous case studies and examples from a variety of industries and disciplines
- Serves as a useful guide for practitioners and entrepreneurs in the business of data technology

The Data Industry: The Business and Economics of Information and Big Data is a resource for practitioners in the data science industry, government, and students in economics, business, and statistics. CHUNLEI TANG, Ph.D., is a research fellow at Harvard University. She is the co-founder of Fudan’s Institute for Data Industry and proposed the concept of the “data industry”. She received a Ph.D. in Computer and Software Theory in 2012 and a Master of Software Engineering in 2006 from Fudan University, Shanghai, China.

Python: Real-World Data Science

Unleash the power of Python and its robust data science capabilities.

About This Book:
- Unleash the power of Python 3 objects
- Learn to use powerful Python libraries for effective data processing and analysis
- Harness the power of Python to analyze data and create insightful predictive models
- Unlock deeper insights into machine learning with this vital guide to cutting-edge predictive analytics

Who This Book Is For: Entry-level analysts who want to enter the data science world will find this course very useful to get themselves acquainted with Python's data science capabilities for doing real-world data analysis.

What You Will Learn:
- Install and set up Python
- Implement objects in Python by creating classes and defining methods
- Get acquainted with NumPy to use it with arrays and array-oriented computing in data analysis
- Create effective visualizations for presenting your data using Matplotlib
- Process and analyze data using the time series capabilities of pandas
- Interact with different kinds of database systems, such as file, disk format, Mongo, and Redis
- Apply data mining concepts to real-world problems
- Compute on big data, including real-time data from the Internet
- Explore how to use different machine learning models to ask different questions of your data

In Detail: The Python: Real-World Data Science course will take you on a journey to become an efficient data science practitioner by thoroughly understanding the key concepts of Python. This learning path is divided into four modules, and each module is a mini course in its own right; as you complete each one, you'll have gained key skills and be ready for the material in the next module. The course begins with getting your Python fundamentals nailed down. After getting familiar with Python core concepts, it's time that you dive into the field of data science. In the second module, you'll learn how to perform data analysis using Python in a practical and example-driven way. The third module will teach you how to design and develop data mining applications using a variety of datasets, starting with basic classification and affinity analysis to more complex data types including text, images, and graphs. Machine learning and predictive analytics have become the most important approaches to uncover data gold mines. In the final module, we'll discuss the necessary details regarding machine learning concepts, offering intuitive yet informative explanations on how machine learning algorithms work, how to use them, and most importantly, how to avoid the common pitfalls.

Style and approach: This course includes all the resources that will help you jump into the data science field with Python and learn how to make sense of data. The aim is to create a smooth learning path that will teach you how to get started with powerful Python libraries and perform various data science techniques in depth.

Understanding and Applying Basic Statistical Methods Using R

Features a straightforward and concise resource for introductory statistical concepts, methods, and techniques using R. Understanding and Applying Basic Statistical Methods Using R uniquely bridges the gap between advances in the statistical literature and methods routinely used by non-statisticians. Providing a conceptual basis for understanding the relative merits and applications of these methods, the book features modern insights and advances relevant to basic techniques in terms of dealing with non-normality, outliers, heteroscedasticity (unequal variances), and curvature. Featuring a guide to R, the book uses R programming to explore introductory statistical concepts and standard methods for dealing with known problems associated with classic techniques. Thoroughly classroom-tested, the book includes sections that focus on either R programming or computational details to help the reader become acquainted with basic concepts and principles essential in terms of understanding and applying the many methods currently available. Covering relevant material from a wide range of disciplines, Understanding and Applying Basic Statistical Methods Using R also includes:
- Numerous illustrations and exercises that use data to demonstrate the practical importance of multiple perspectives
- Discussions on common mistakes such as eliminating outliers and applying standard methods based on means using the remaining data
- Detailed coverage on R programming with descriptions on how to apply both classic and more modern methods using R
- A companion website with the data and solutions to all of the exercises

Understanding and Applying Basic Statistical Methods Using R is an ideal textbook for undergraduate and graduate-level statistics courses in the science and/or social science departments. The book can also serve as a reference for professional statisticians and other practitioners looking to better understand modern statistical methods as well as R programming.

Learning Pentaho CTools

Learning Pentaho CTools is a comprehensive guide to building sophisticated and custom analytics dashboards using the powerful capabilities of Pentaho CTools. This book walks you through the process of creating interactive dashboards, integrating data sources, and applying data visualization best practices. You'll quickly gain the expertise needed to create impactful dashboards with ease.

What this Book will help me do:
- Master installing and configuring CTools for Pentaho to jumpstart dashboard development.
- Harness diverse data sources and deliver data in formats like CSV, JSON, and XML for customized analytics.
- Design and implement dynamic, visually stunning dashboards using Community Dashboard Framework (CDF).
- Deploy and integrate plugins, leverage widgets, and manage dashboards effectively with version control.
- Enhance interactivity by customizing dashboard components, charts, and filters to suit unique requirements.

Author(s): Gaspar, an expert in Pentaho and its tools, has been a Senior Consultant at Pentaho, where he gained in-depth experience crafting analytics solutions. He brings to this book his teaching passion and field expertise, combining theoretical insights with practical applications. His approachable style ensures readers can follow technical concepts effectively.

Who is it for? This book is ideal for developers who are looking to enhance their understanding of Pentaho's CTools portfolio to build advanced dashboards. A working knowledge of JavaScript and CSS will enable readers to get the most out of this guide. Whether you aim to extend your analytics capabilities or learn the tools from scratch, this book bridges the gap between learning and application.

Network Reliability

In engineering theory and applications, we think and operate in terms of logics and models with some acceptable and reasonable assumptions. The present text is aimed at providing modelling and analysis techniques for the evaluation of reliability measures (2-terminal, all-terminal, k-terminal reliability) for systems whose structure can be described in the form of a probabilistic graph. Among the several approaches of network reliability evaluation, the multiple-variable-inversion sum-of-disjoint product approach finds a well-deserved niche as it provides the reliability or unreliability expression in a most efficient and compact manner. However, it does require efficiently enumerated minimal inputs (minimal path, spanning tree, minimal k-trees, minimal cut, minimal global-cut, minimal k-cut), depending on the desired reliability. The present book covers these two aspects in detail through the descriptions of several algorithms devised by the ‘reliability fraternity’ and explained through solved examples to obtain and evaluate 2-terminal, k-terminal and all-terminal network reliability/unreliability measures, which could be its unique selling point. The accompanying web-based supplementary information containing modifiable Matlab® source code for the algorithms is another feature of this book. A very concerted effort has been made to keep the book ideally suitable for a first course or even for a novice stepping into the area of network reliability. The mathematical treatment is kept as minimal as possible with an assumption on the readers’ side that they have basic knowledge in graph theory, probability laws, Boolean laws and set theory.

Cyber-Risk Informatics

This book provides a scientific modeling approach for conducting metrics-based quantitative risk assessments of cybersecurity vulnerabilities and threats. The author builds from a common understanding based on previous class-tested works to introduce the reader to the current and newly innovative approaches to address the maliciously-by-human-created (rather than by-chance-occurring) vulnerability and threat, and related cost-effective management to mitigate such risk. This book is purely statistical data-oriented (not deterministic) and employs computationally intensive techniques, such as Monte Carlo and Discrete Event Simulation. The enriched JAVA ready-to-go applications and solutions to exercises provided by the author at the book’s specifically preserved website will enable readers to utilize the course-related problems.
• Enables the reader to use the book's website's applications to implement and see results, and use them making ‘budgetary’ sense
• Utilizes a data analytical approach and provides clear entry points for readers of varying skill sets and backgrounds
• Developed out of necessity from real in-class experience while teaching advanced undergraduate and graduate courses by the author

Cyber-Risk Informatics is a resource for undergraduate students, graduate students, and practitioners in the field of Risk Assessment and Management regarding Security and Reliability Modeling. Mehmet Sahinoglu, a Professor (1990) Emeritus (2000), is the founder of the Informatics Institute (2009) and its SACS-accredited (2010) and NSA-certified (2013) flagship Cybersystems and Information Security (CSIS) graduate program (the first such full degree in-class program in Southeastern USA) at AUM, Auburn University’s metropolitan campus in Montgomery, Alabama. He is a fellow member of the SDPS Society, a senior member of the IEEE, and an elected member of ISI. Sahinoglu is the recipient of Microsoft's Trustworthy Computing Curriculum (TCC) award and the author of Trustworthy Computing (Wiley, 2007).

Mastering the SAS DS2 Procedure

Enhance your SAS® data wrangling skills with high precision and parallel data manipulation using the new DS2 programming language.

This book addresses the new DS2 programming language from SAS, which combines the precise procedural power and control of the Base SAS DATA step language with the simplicity and flexibility of SQL. DS2 provides simple, safe syntax for performing complex data transformations in parallel and enables manipulation of native database data types at full precision. It also introduces PROC FEDSQL, a modernized SQL language that blends perfectly with DS2. You will learn to harness the power of parallel processing to speed up CPU-intensive computing processes in Base SAS and how to achieve even more speed by processing DS2 programs on massively parallel database systems. Techniques for leveraging Internet APIs to acquire data, avoiding large data movements when working with data from disparate sources, and leveraging DS2’s new data types for full-precision numeric calculations are presented, with examples of why these techniques are essential for the modern data wrangler.

While working through the code samples provided with this book, you will build a library of custom, reusable, and easily shareable DS2 program modules, execute parallelized DATA step programs to speed up a CPU-intensive process, and conduct advanced data transformations using hash objects and matrix math operations.

Threat Forecasting

Drawing upon years of practical experience and using numerous examples and illustrative case studies, Threat Forecasting: Leveraging Big Data for Predictive Analysis discusses important topics, including the danger of using historic data as the basis for predicting future breaches, how to use security intelligence as a tool to develop threat forecasting techniques, and how to use threat data visualization techniques and threat simulation tools. Readers will gain valuable security insights into unstructured big data, along with tactics on how to use the data to their advantage to reduce risk.
- Presents case studies and actual data to demonstrate threat data visualization techniques and threat simulation tools
- Explores the usage of kill chain modelling to inform actionable security intelligence
- Demonstrates a methodology that can be used to create a full threat forecast analysis for enterprise networks of any size

The Evolution of Analytics

Machine learning is a hot topic in business. Even data-driven organizations that have spent years developing successful data analysis platforms, with many accurate statistical models in place, are now looking into this decades-old discipline. But how can companies turn hyped opportunities for machine learning into real business value? This report examines the growing momentum of machine learning in the analytics landscape, the challenges machine learning presents to businesses, and examples of how organizations are actively seeking to incorporate modern machine learning techniques into their production data infrastructures. Authors Patrick Hall, Wen Phan, and Katie Whitson look at two companies in depth, one in healthcare and one in finance, that are seeing the real impact of machine learning. Discover how machine learning can help your organization:
- Analyze and generate insights from large amounts of varied, messy, and unstructured data unfit for traditional statistical analysis
- Increase the predictive accuracy beyond what was previously possible
- Augment aging analytical processes and other decision-making tools

2016 Software Development Salary Survey

Early this year, more than 5000 software engineers, developers, and other programming professionals participated in O’Reilly Media’s first Software Development Salary Survey. Participants included professionals from large and small companies in a variety of industries across 51 countries and all 50 US states. With the complete survey results in this in-depth report, you’ll be able to explore the world of software development, and the careers that propel it, in great detail. With this report, you’ll learn: the top programming languages that respondents currently use professionally; where programmers make the highest salaries, by country and by region in the US; salary ranges by industry and by specific programming language; the difference in earnings between programmers who work on tiny teams and those who work on larger teams; the most common programming languages that respondents no longer use in their work; and the most common languages that respondents intend to learn within the next couple of years. Pick up a copy of this report and find out where you stand in the programming world. We encourage you to plug your own data points into our survey model to see how you compare to other programming professionals in your industry.

Regression Analysis Microsoft® Excel®

This is today’s most complete guide to regression analysis with Microsoft® Excel for any business analytics or research task. Drawing on 25 years of advanced statistical experience, Microsoft MVP Conrad Carlberg shows how to use Excel’s regression-related worksheet functions to perform a wide spectrum of practical analyses. Carlberg clearly explains all the theory you’ll need to avoid mistakes, understand what your regressions are really doing, and evaluate analyses performed by others. From simple correlations and t-tests through multiple analysis of covariance, Carlberg offers hands-on, step-by-step walkthroughs using meaningful examples. He discusses the consequences of using each option and argument, points out idiosyncrasies and controversies associated with Excel’s regression functions, and shows how to use them reliably in fields ranging from medical research to financial analysis to operations. You don’t need expensive software or a doctorate in statistics to work with regression analyses. Microsoft Excel has all the tools you need, and this book has all the knowledge! Understand what regression analysis can and can’t do, and why. Master regression-based functions built into all recent versions of Excel. Work with correlation and simple regression. Make the most of Excel’s improved LINEST() function. Plan and perform multiple regression. Distinguish the assumptions that matter from the ones that don’t. Extend your analysis options by using regression instead of traditional analysis of variance. Add covariates to your analysis to reduce bias and increase statistical power.

Introducing Data Science

Introducing Data Science teaches you how to accomplish the fundamental tasks that occupy data scientists. Using the Python language and common Python libraries, you'll experience firsthand the challenges of dealing with data at scale and gain a solid foundation in data science. About the Technology: Many companies need developers with data science skills to work on projects ranging from social media marketing to machine learning. Discovering what you need to learn to begin a career as a data scientist can seem bewildering. This book is designed to help you get started. About the Book: Introducing Data Science explains vital data science concepts and teaches you how to accomplish the fundamental tasks that occupy data scientists. You'll explore data visualization, graph databases, the use of NoSQL, and the data science process. You'll use the Python language and common Python libraries as you experience firsthand the challenges of dealing with data at scale. Discover how Python allows you to gain insights from data sets so big that they need to be stored on multiple machines, or from data moving so quickly that no single machine can handle it. This book gives you hands-on experience with the most popular Python data science libraries, Scikit-learn and StatsModels. After reading this book, you'll have the solid foundation you need to start a career in data science. What's Inside: Handling large data; Introduction to machine learning; Using Python to work with data; Writing data science algorithms. About the Reader: This book assumes you're comfortable reading code in Python or a similar language, such as C, Ruby, or JavaScript. No prior experience with data science is required. About the Authors: Davy Cielen, Arno D. B. Meysman, and Mohamed Ali are the founders and managing partners of Optimately and Maiton, where they focus on developing data science projects and solutions in various sectors.
Quotes: "Read this book if you want to get a quick overview of data science, with lots of examples to get you started!" - Alvin Raj, Oracle. "The map that will help you navigate the data science oceans." - Marius Butuc, Shopify. "Covers the processes involved in data science from end to end… A complete overview." - Heather Campbell, Kainos. "A must-read for anyone who wants to get into the data science world." - Hector Cuesta, Big Data Bootcamp.