talk-data.com

Topic

GitHub

version_control collaboration code_hosting

11 tagged

Activity Trend: 79 peak/qtr, 2020-Q1 to 2026-Q1

Activities

Filtering by: O'Reilly Data Science Books

Modern Business Analytics

Deriving business value from analytics is a challenging process. Turning data into information requires a business analyst who is adept at multiple technologies including databases, programming tools, and commercial analytics tools. This practical guide shows programmers who understand analysis concepts how to build the skills necessary to achieve business value. Author Deanne Larson, data science practitioner and academic, helps you bridge the technical and business worlds to meet these requirements. You'll focus on developing these skills with R and Python using real-world examples. You'll also learn how to leverage methodologies for successful delivery. Learning methodology combined with open source tools is key to delivering successful business analytics and value.
This book shows you how to:
- Apply business analytics methodologies to achieve successful results
- Cleanse and transform data using R and Python
- Use R and Python to complete exploratory data analysis
- Create predictive models to solve business problems in R and Python
- Use Python, R, and business analytics tools to handle large volumes of data
- Commit code to GitHub to collaborate with data engineers and data scientists
- Measure success in business analytics

The Definitive Guide to KQL: Using Kusto Query Language for operations, defending, and threat hunting

Turn the avalanche of raw data from Azure Data Explorer, Azure Monitor, Microsoft Sentinel, and other Microsoft data platforms into actionable intelligence with KQL (Kusto Query Language). Experts in information security and analysis guide you through what it takes to automate your approach to risk assessment and remediation, speeding up detection time while reducing manual work using KQL. This accessible and practical guide, designed for a broad range of people with varying experience in KQL, will quickly make KQL second nature for information security.
Solve real problems with Kusto Query Language and build your competitive advantage:
- Learn the fundamentals of KQL: what it is and where it is used
- Examine the anatomy of a KQL query
- Understand why data summation and aggregation are important
- See examples of data summation, including count, countif, and dcount
- Learn the benefits of moving from raw data ingestion to a more automated approach for security operations
- Unlock how to write efficient and effective queries
- Work with advanced KQL operators, advanced data strings, and multivalued strings
- Explore KQL for day-to-day admin tasks, performance, and troubleshooting
- Use KQL across Azure, including app services and function apps
- Delve into defending and threat hunting using KQL
- Recognize indicators of compromise and anomaly detection
- Learn to access and contribute to hunting queries via GitHub and workbooks via Microsoft Entra ID

M-statistics

A comprehensive resource providing new statistical methodologies and demonstrating how new approaches work for applications. M-statistics introduces a new approach to statistical inference, redesigning the fundamentals of statistics, and improving on the classical methods we already use. This book targets exact optimal statistical inference for a small sample. Two competing approaches are offered, maximum concentration (MC) and mode (MO) statistics, combined under one methodological umbrella; hence the symbolic equation M = MC + MO. M-statistics defines an estimator as the limit point of the MC or MO exact optimal confidence interval when the confidence level approaches zero, yielding the MC and MO estimator, respectively. Neither mean nor variance plays a role in M-statistics theory. Novel statistical methodologies in the form of double-sided unbiased and short confidence intervals and tests apply to major statistical parameters. Exact statistical inference for small sample sizes is illustrated with effect size and coefficient of variation, the rate parameter of the Pareto distribution, two-sample statistical inference for normal variance, and the rate of exponential distributions. M-statistics is illustrated with discrete, binomial, and Poisson distributions. Novel estimators eliminate paradoxes with the classic unbiased estimators when the outcome is zero. Exact optimal statistical inference applies to correlation analysis including Pearson correlation, squared correlation coefficient, and coefficient of determination. New MC and MO estimators along with optimal statistical tests, accompanied by respective power functions, are developed. M-statistics is extended to the multidimensional parameter and illustrated with the simultaneous statistical inference for the mean and standard deviation, shape parameters of the beta distribution, the two-sample binomial distribution, and finally, nonlinear regression.
Our new developments are accompanied by respective algorithms and R codes, available at GitHub, and as such readily available for applications. M-statistics is suitable for professionals and students alike. It is highly useful for theoretical statisticians and teachers, researchers, and data science analysts as an alternative to classical and approximate statistical inference.

R Packages, 2nd Edition

Turn your R code into packages that others can easily install and use. With this fully updated edition, developers and data scientists will learn how to bundle reusable R functions, sample data, and documentation together by applying the package development philosophy used by the team that maintains the "tidyverse" suite of packages. In the process, you'll learn how to automate common development tasks using a set of R packages, including devtools, usethis, testthat, and roxygen2. Authors Hadley Wickham and Jennifer Bryan from Posit (formerly known as RStudio) help you create packages quickly, then teach you how to get better over time. You'll be able to focus on what you want your package to do as you progressively develop greater mastery of the structure of a package.
With this book, you will:
- Learn the key components of an R package, including code, documentation, and tests
- Streamline your development process with devtools and the RStudio IDE
- Get tips on effective habits such as organizing functions into files
- Get caught up on important new features in the devtools ecosystem
- Learn about the art and science of unit testing, using features in the third edition of testthat
- Turn your existing documentation into a beautiful and user friendly website with pkgdown
- Gain an appreciation of the benefits of modern code hosting platforms, such as GitHub

R 4 Data Science Quick Reference: A Pocket Guide to APIs, Libraries, and Packages

In this handy, quick reference book you'll be introduced to several R data science packages, with examples of how to use each of them. All concepts will be covered concisely, with many illustrative examples using the following APIs: readr, tibble, forcats, lubridate, stringr, tidyr, magrittr, dplyr, purrr, ggplot2, modelr, and more. With R 4 Data Science Quick Reference, you'll have the code, APIs, and insights to write data science-based applications in the R programming language. You'll also be able to carry out data analysis. All source code used in the book is freely available on GitHub.
What You'll Learn:
- Implement applicable R 4 programming language specification features
- Import data with readr
- Work with categories using forcats, time and dates with lubridate, and strings with stringr
- Format data using tidyr and then transform that data using magrittr and dplyr
- Write functions with R for data science, data mining, and analytics-based applications
- Visualize data with ggplot2 and fit data to models using modelr
Who This Book Is For:
Programmers new to R's data science, data mining, and analytics packages. Some prior coding experience with R in general is recommended.

Advanced R 4 Data Programming and the Cloud: Using PostgreSQL, AWS, and Shiny

Program for data analysis using R and learn practical skills to make your work more efficient. This revised book explores how to automate running code and the creation of reports to share your results, as well as writing functions and packages. It includes key R 4 features such as a new color palette for charts, an enhanced reference counting system, and normalization of matrix and array types where matrix objects now formally inherit from the array class, eliminating inconsistencies. Advanced R 4 Data Programming and the Cloud is not designed to teach advanced R programming nor to teach the theory behind statistical procedures. Rather, it is designed to be a practical guide moving beyond merely using R; it shows you how to program in R to automate tasks. This book will teach you how to manipulate data in modern R structures and includes connecting R to databases such as PostgreSQL, cloud services such as Amazon Web Services (AWS), and digital dashboards such as Shiny. Each chapter also includes a detailed bibliography with references to research articles and other resources that cover relevant conceptual and theoretical topics.
What You Will Learn:
- Write and document R functions using R 4
- Make an R package and share it via GitHub or privately
- Add tests to R code to ensure it works as intended
- Use R to talk directly to databases and do complex data management
- Run R in the Amazon cloud
- Deploy a Shiny digital dashboard
- Generate presentation-ready tables and reports using R
Who This Book Is For:
Working professionals, researchers, and students who are familiar with R and basic statistical techniques such as linear regression and who want to learn how to take their R coding and programming to the next level.

Programming Skills for Data Science: Start Writing Code to Wrangle, Analyze, and Visualize Data with R, First Edition

The Foundational Hands-On Skills You Need to Dive into Data Science
“Freeman and Ross have created the definitive resource for new and aspiring data scientists to learn foundational programming skills.” –From the foreword by Jared Lander, series editor
Using data science techniques, you can transform raw data into actionable insights for domains ranging from urban planning to precision medicine. Programming Skills for Data Science brings together all the foundational skills you need to get started, even if you have no programming or data science experience. Leading instructors Michael Freeman and Joel Ross guide you through installing and configuring the tools you need to solve professional-level data science problems, including the widely used R language and Git version-control system. They explain how to wrangle your data into a form where it can be easily used, analyzed, and visualized so others can see the patterns you've uncovered. Step by step, you'll master powerful R programming techniques and troubleshooting skills for probing data in new ways, and at larger scales. Freeman and Ross teach through practical examples and exercises that can be combined into complete data science projects. Everything's focused on real-world application, so you can quickly start analyzing your own data and getting answers you can act upon.
Learn to:
- Install your complete data science environment, including R and RStudio
- Manage projects efficiently, from version tracking to documentation
- Host, manage, and collaborate on data science projects with GitHub
- Master R language fundamentals: syntax, programming concepts, and data structures
- Load, format, explore, and restructure data for successful analysis
- Interact with databases and web APIs
- Master key principles for visualizing data accurately and intuitively
- Produce engaging, interactive visualizations with ggplot and other R packages
- Transform analyses into sharable documents and sites with R Markdown
- Create interactive web data science applications with Shiny
- Collaborate smoothly as part of a data science team
Register your book for convenient access to downloads, updates, and/or corrections as they become available. See inside book for details.

Statistical Rethinking

Statistical Rethinking: A Bayesian Course with Examples in R and Stan builds readers’ knowledge of and confidence in statistical modeling. Reflecting the need for even minor programming in today’s model-based statistics, the book pushes readers to perform step-by-step calculations that are usually automated. This unique computational approach ensures that readers understand enough of the details to make reasonable choices and interpretations in their own modeling work. The text presents generalized linear multilevel models from a Bayesian perspective, relying on a simple logical interpretation of Bayesian probability and maximum entropy. It covers topics ranging from the basics of regression to multilevel models. The author also discusses measurement error, missing data, and Gaussian process models for spatial and network autocorrelation. By using complete R code examples throughout, this book provides a practical foundation for performing statistical inference. Designed for both PhD students and seasoned professionals in the natural and social sciences, it prepares them for more advanced or specialized statistical modeling.
Web Resource: The book is accompanied by an R package (rethinking) that is available on the author’s website and GitHub. The two core functions (map and map2stan) of this package allow a variety of statistical models to be constructed from standard model formulas.

Python for Data Analysis, 2nd Edition

Get complete instructions for manipulating, processing, cleaning, and crunching datasets in Python. Updated for Python 3.6, the second edition of this hands-on guide is packed with practical case studies that show you how to solve a broad set of data analysis problems effectively. You’ll learn the latest versions of pandas, NumPy, IPython, and Jupyter in the process. Written by Wes McKinney, the creator of the Python pandas project, this book is a practical, modern introduction to data science tools in Python. It’s ideal for analysts new to Python and for Python programmers new to data science and scientific computing. Data files and related material are available on GitHub.
- Use the IPython shell and Jupyter notebook for exploratory computing
- Learn basic and advanced features in NumPy (Numerical Python)
- Get started with data analysis tools in the pandas library
- Use flexible tools to load, clean, transform, merge, and reshape data
- Create informative visualizations with matplotlib
- Apply the pandas groupby facility to slice, dice, and summarize datasets
- Analyze and manipulate regular and irregular time series data
- Learn how to solve real-world data analysis problems with thorough, detailed examples

Advanced R: Data Programming and the Cloud

Program for data analysis using R and learn practical skills to make your work more efficient. This book covers how to automate running code and the creation of reports to share your results, as well as writing functions and packages. Advanced R is not designed to teach advanced R programming nor to teach the theory behind statistical procedures. Rather, it is designed to be a practical guide moving beyond merely using R to programming in R to automate tasks. This book will show you how to manipulate data in modern R structures and includes connecting R to databases such as SQLite, PostgreSQL, and MongoDB. The book closes with a hands-on section to get R running in the cloud. Each chapter also includes a detailed bibliography with references to research articles and other resources that cover relevant conceptual and theoretical topics.
What You Will Learn:
- Write and document R functions
- Make an R package and share it via GitHub or privately
- Add tests to R code to ensure it works as intended
- Build packages automatically with GitHub
- Use R to talk directly to databases and do complex data management
- Run R in the Amazon cloud
- Generate presentation-ready tables and reports using R
Who This Book Is For:
Working professionals, researchers, or students who are familiar with R and basic statistical techniques such as linear regression and who want to learn how to take their R coding and programming to the next level.

Mastering RStudio: Develop, Communicate, and Collaborate with R

"Mastering RStudio: Develop, Communicate, and Collaborate with R" is your guide to unlocking the potential of RStudio. You'll learn to use RStudio effectively in your data science projects, covering everything from creating R packages to interactive web apps with Shiny. By the end, you'll fully understand how to use RStudio tools to manage projects and share results effectively.
What this Book will help me do:
- Gain a comprehensive understanding of the RStudio interface and workflow optimizations.
- Effectively communicate data insights with R Markdown, including static and interactive documents.
- Create impactful data visualizations using R's diverse graphical systems and tools.
- Develop Shiny web applications to showcase and share analytical results.
- Learn to collaborate on projects using Git and GitHub, and understand R package development workflows.
Author(s):
Julian Hillebrand and Maximilian H. Nierhoff are experienced R developers with years of practical expertise in data science and software development. They have a passion for teaching how to utilize RStudio effectively. Their approach to writing combines practical examples with thorough explanations, ensuring readers can readily apply concepts to real-world scenarios.
Who is it for?
This book is ideal for R programmers and analysts seeking to enhance their workflows using RStudio. Whether you're looking to create professional data visualizations, develop R packages, or implement Shiny web applications, this book provides the tools you need. Suitable for those already familiar with basic R programming and fundamental concepts.