talk-data.com

Event

O'Reilly Data Science Books

2013-08-09 – 2026-02-25 O'Reilly

Activities tracked

2118

Collection of O'Reilly books on Data Science.

Sessions & talks

Showing 851–875 of 2118 · Newest first

Practical Data Science: A Guide to Building the Technology Stack for Turning Data Lakes into Business Assets

Learn how to build a data science technology stack and perform good data science with repeatable methods. You will learn how to turn data lakes into business assets. The data science technology stack demonstrated in Practical Data Science is built from components in general use in the industry. Data scientist Andreas Vermeulen demonstrates in detail how to build and provision a technology stack to yield repeatable results. He shows you how to apply practical methods to extract actionable business knowledge from data lakes consisting of data from a polyglot of data types and dimensions. What You'll Learn Become fluent in the essential concepts and terminology of data science and data engineering Build and use a technology stack that meets industry criteria Master the methods for retrieving actionable business knowledge Coordinate the handling of polyglot data types in a data lake for repeatable results Who This Book Is For Data scientists and data engineers who are required to convert data from a data lake into actionable knowledge for their business, and students who aspire to be data scientists and data engineers

R Projects For Dummies

Make the most of R’s extensive toolset R Projects For Dummies offers a unique learn-by-doing approach. You will increase the depth and breadth of your R skillset by completing a wide variety of projects. By using R’s graphics, interactive, and machine learning tools, you’ll learn to apply R’s extensive capabilities in an array of scenarios. The depth of the project experience is unmatched by any other content online or in print. And you just might increase your statistics knowledge along the way, too! R is a free tool, and it’s the basis of a huge amount of work in data science. It's taking the place of costly statistical software that sometimes takes a long time to learn. One reason is that you can use just a few R commands to create sophisticated analyses. Another is that easy-to-learn R graphics enable you to make the results of those analyses available to a wide audience. This book will help you sharpen your skills by applying them in the context of projects with R, including dashboards, image processing, data reduction, mapping, and more. Appropriate for R users at all levels Helps R programmers plan and complete their own projects Focuses on R functions and packages Shows how to carry out complex analyses by just entering a few commands If you’re brand new to R or just want to brush up on your skills, R Projects For Dummies will help you complete your projects with ease.

Python Web Scraping Cookbook

Python Web Scraping Cookbook is your comprehensive guide to building efficient and functional web scraping tools using Python. With practical recipes, you'll learn to overcome the challenges of dynamic content, captcha, and irregular web structures while deploying scalable solutions. What this Book will help me do Master the use of Python libraries like BeautifulSoup and Scrapy for scraping data. Perfect techniques for handling JavaScript-heavy sites using Selenium. Learn to overcome web scraping challenges, such as captchas and rate-limiting. Design scalable scraping pipelines with cloud deployment in AWS. Understand web data extraction techniques with XPath, CSS selectors, and more. Author(s) Michael Heydt is a seasoned software engineer and technical author with a focus on data engineering and cloud solutions. Having worked with Python extensively, he brings real-world insights into web scraping. His practical approach simplifies complex concepts. Who is it for? This book is perfect for Python developers and data enthusiasts keen to master web scraping techniques. If you're a programmer with insights into Python scripting and wish to scrape, analyze, and utilize web data efficiently, this book is for you.

SAS Viya

Learn how to access analytics from SAS Cloud Analytic Services (CAS) using Python and the SAS Viya platform. SAS Viya: The Python Perspective is an introduction to using the Python client on the SAS Viya platform. SAS Viya is a high-performance, fault-tolerant analytics architecture that can be deployed on both public and private cloud infrastructures. While SAS Viya can be used by various SAS applications, it also enables you to access analytic methods from SAS, Python, Lua, and Java, as well as through a REST interface using HTTP or HTTPS. This book focuses on the perspective of SAS Viya from Python. SAS Viya is made up of multiple components. The central piece of this ecosystem is SAS Cloud Analytic Services (CAS). CAS is the cloud-based server that all clients communicate with to run analytical methods. The Python client is used to drive the CAS component directly using objects and constructs that are familiar to Python programmers. Some knowledge of Python would be helpful before using this book; however, there is an appendix that covers the features of Python that are used in the CAS Python client. Knowledge of CAS is not required to use this book. However, you will need to have a CAS server set up and running to execute the examples in this book. With this book, you will learn how to: Install the required components for accessing CAS from Python Connect to CAS, load data, and run simple analyses Work with CAS using APIs familiar to Python users Grasp general CAS workflows and advanced features of the CAS Python client SAS Viya: The Python Perspective covers topics that will be useful to beginners as well as experienced CAS users. It includes examples from creating connections to CAS all the way to simple statistics and machine learning, but it is also useful as a desktop reference.

An Introduction to Discrete-Valued Time Series

A much-needed introduction to the field of discrete-valued time series, with a focus on count-data time series. Time series analysis is an essential tool in a wide array of fields, including business, economics, computer science, epidemiology, finance, manufacturing and meteorology, to name just a few. Despite growing interest in discrete-valued time series, especially those arising from counting specific objects or events at specified times, most books on time series give short shrift to that increasingly important subject area. This book seeks to rectify that state of affairs by providing a much-needed introduction to discrete-valued time series, with particular focus on count-data time series. The main focus of this book is on modeling. Throughout, numerous examples are provided illustrating models currently used in discrete-valued time series applications. Statistical process control, including various control charts (such as cumulative sum control charts), and performance evaluation are treated at length. Classic approaches like ARMA models and the Box-Jenkins program are also featured, with the basics of these approaches summarized in an Appendix. In addition, data examples, with all relevant R code, are available on a companion website.
Provides a balanced presentation of theory and practice, exploring both categorical and integer-valued series Covers common models for time series of counts as well as for categorical time series, and works out their most important stochastic properties Addresses statistical approaches for analyzing discrete-valued time series and illustrates their implementation with numerous data examples Covers classical approaches such as ARMA models, Box-Jenkins program and how to generate functions Includes dataset examples with all necessary R code provided on a companion website An Introduction to Discrete-Valued Time Series is a valuable working resource for researchers and practitioners in a broad range of fields, including statistics, data science, machine learning, and engineering. It will also be of interest to postgraduate students in statistics, mathematics and economics.

Loss Data Analysis

This volume deals with two complementary topics. On one hand the book deals with the problem of determining the probability distribution of a positive compound random variable, a problem which appears in the banking and insurance industries, in many areas of operational research and in reliability problems in the engineering sciences. On the other hand, the methodology proposed to solve such problems, which is based on an application of the maximum entropy method to invert the Laplace transform of the distributions, can be applied to many other problems. The book contains applications to a large variety of problems, including the problem of dependence of the sample data used to estimate empirically the Laplace transform of the random variable. Contents Introduction Frequency models Individual severity models Some detailed examples Some traditional approaches to the aggregation problem Laplace transforms and fractional moment problems The standard maximum entropy method Extensions of the method of maximum entropy Superresolution in maxentropic Laplace transform inversion Sample data dependence Disentangling frequencies and decompounding losses Computations using the maxentropic density Review of statistical procedures
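For orientation, the compound-loss setting mentioned in this description is commonly written as below. This is a sketch under stated assumptions, not the book's own notation: the symbols S, N, X_i, psi_S, phi_X and G_N are chosen here for illustration, and the identity assumes the claim count N is independent of the independent, identically distributed severities X_i.

```latex
S = \sum_{i=1}^{N} X_i, \qquad
\psi_S(\alpha) = \mathbb{E}\left[e^{-\alpha S}\right]
               = G_N\bigl(\phi_X(\alpha)\bigr),
\qquad
G_N(z) = \mathbb{E}\left[z^{N}\right], \quad
\phi_X(\alpha) = \mathbb{E}\left[e^{-\alpha X_1}\right].

% Special case, assuming N is Poisson with mean \lambda:
\psi_S(\alpha) = \exp\bigl(\lambda\,(\phi_X(\alpha) - 1)\bigr).
```

Recovering the density of S from values of psi_S is the Laplace-transform inversion problem to which the description says the maximum entropy method is applied.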

Market Data Analysis Using JMP

With the powerful interactive and visual functionality of JMP, you can dynamically analyze market data to transform it into actionable and useful information with clear, concise, and insightful reports and displays. Market Data Analysis Using JMP is a unique example-driven book because it has a specific application focus: market data analysis. A working knowledge of JMP will help you turn your market data into vital knowledge that will help you succeed in a highly competitive, fast-moving, and dynamic business world. This book can be used as a stand-alone resource for working professionals, or as a supplement to a business school course in market data research. Anyone who works with market data will benefit from reading and studying this book, then using JMP to apply the dynamic analytical concepts to their market data. After reading this book, you will be able to quickly and effortlessly use JMP to: prepare market data for analysis use and interpret sophisticated statistical methods build choice models estimate regression models to turn data into useful and actionable information Market Data Analysis Using JMP will teach you how to use dynamic graphics to illustrate your market data analysis and explore the vast possibilities that your data can offer!

An Introduction to SAS University Edition

SAS® OnDemand for Academics is now the primary software choice for learners. SAS OnDemand for Academics is available for free access to SAS for individual learners as well as university educators and students. Access to SAS University Edition will end Aug. 2, 2021; users will no longer be able to download it after Apr. 30, 2021. Get up and running with the SAS University Edition using Ron Cody’s easy-to-follow, step-by-step guide. Aimed at beginners who have downloaded the free SAS University Edition and want to either use the point-and-click interactive environment of SAS Studio, or who want to write their own SAS programs, or both, An Introduction to SAS University Edition begins by showing you how to obtain the SAS University Edition, and how you can run SAS on a PC or Macintosh computer. The first part of the book shows you how to perform basic tasks, such as producing a report, summarizing data, producing charts and graphs, and using the SAS Studio built-in tasks. The first part also describes how you can perform basic statistical tests using the interactive point-and-click environment. The second part of the book shows you how to write your own SAS programs, and how to use SAS procedures to perform a variety of tasks. This part of the book also explains how to read data from a variety of sources: text files, Excel workbooks, and CSV files. In order to get familiar with the SAS Studio environment, this book also shows you how to access dozens of interesting data sets that are included with the product.

Regression Analysis with R

Dive into the world of regression analysis with this hands-on guide that covers everything you need to know about building effective regression models in R. You'll learn both the theoretical foundations and how to apply them using practical examples and R code. By the end, you'll be equipped to interpret regression results and use them to make meaningful predictions. What this Book will help me do Master the fundamentals of regression analysis, from simple linear to logistic regression. Gain expertise in R programming for implementing regression models and analyzing results. Develop skills in handling missing data, feature engineering, and exploratory data analysis. Understand how to identify, prevent, and address overfitting and underfitting issues in modeling. Apply regression techniques in real-world applications, including classification problems and advanced methods like Bagging and Boosting. Author(s) Giuseppe Ciaburro is an experienced data scientist and author with a passion for making complex technical topics accessible. With expertise in R programming and regression analysis, he has worked extensively in statistical modeling and data exploration. Giuseppe's writing combines clear explanations of theory with hands-on examples, ideal for learners and practitioners alike. Who is it for? This book is perfect for aspiring data scientists and analysts eager to understand and apply regression analysis using R. It's suited for readers with a foundational knowledge of statistics and basic R programming experience. Whether you're delving into data science or aiming to strengthen existing skills, this book offers practical insights to reach your goals.

Interval Finite Element Method with MATLAB

Interval Finite Element Method with MATLAB provides a thorough introduction to an effective way of investigating problems involving uncertainty using computational modeling. The well-known and versatile Finite Element Method (FEM) is combined with the concept of interval uncertainties to develop the Interval Finite Element Method (IFEM). An interval or stochastic environment in parameters and variables is used in place of crisp ones to make the governing equations interval, thereby allowing modeling of the problem. The concept of interval uncertainties is systematically explained. Several examples are explored with IFEM using MATLAB on topics like spring mass, bar, truss and frame. Provides a systematic approach to understanding the interval uncertainties caused by vague or imprecise data Describes the interval finite element method in detail Gives step-by-step instructions for how to use MATLAB code for IFEM Provides a range of examples of IFEM in use, with accompanying MATLAB codes
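As a rough illustration of what "interval uncertainties" mean for the simplest case listed above, the sketch below solves a single spring, u = F / k, when both the force and the stiffness are known only as intervals. It is written in Python rather than the book's MATLAB, the `Interval` class and the numeric values are hypothetical, and it ignores the outward rounding that a rigorous interval library would apply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi]; a hypothetical helper, not a library type."""
    lo: float
    hi: float

    def __post_init__(self):
        if self.lo > self.hi:
            raise ValueError("lower bound exceeds upper bound")

    def __truediv__(self, other: "Interval") -> "Interval":
        # Division is only defined when the divisor interval excludes zero.
        if other.lo <= 0.0 <= other.hi:
            raise ZeroDivisionError("divisor interval contains zero")
        # The quotient's bounds are attained at endpoint combinations.
        candidates = [
            self.lo / other.lo, self.lo / other.hi,
            self.hi / other.lo, self.hi / other.hi,
        ]
        return Interval(min(candidates), max(candidates))


# Hypothetical data: force in newtons, stiffness in newtons per metre.
force = Interval(95.0, 105.0)
stiffness = Interval(1900.0, 2100.0)

# Single-spring equilibrium k * u = F, solved with interval division.
displacement = force / stiffness
print(displacement)
```

The resulting interval encloses every displacement consistent with the uncertain inputs. For systems with many degrees of freedom, naive interval arithmetic of this kind tends to overestimate the bounds, which is one reason dedicated interval finite element formulations exist.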

Complex Network Analysis in Python

Construct, analyze, and visualize networks with networkx, a Python language module. Network analysis is a powerful tool you can apply to a multitude of datasets and situations. Discover how to work with all kinds of networks, including social, product, temporal, spatial, and semantic networks. Convert almost any real-world data into a complex network--such as recommendations on co-using cosmetic products, muddy hedge fund connections, and online friendships. Analyze and visualize the network, and make business decisions based on your analysis. If you're a curious Python programmer, a data scientist, or a CNA specialist interested in mechanizing mundane tasks, you'll increase your productivity exponentially. Complex network analysis used to be done by hand or with non-programmable network analysis tools, but not anymore! You can now automate and program these tasks in Python. Complex networks are collections of connected items, words, concepts, or people. By exploring their structure and individual elements, we can learn about their meaning, evolution, and resilience. Starting with simple networks, convert real-life and synthetic network graphs into networkx data structures. Look at more sophisticated networks and learn more powerful machinery to handle centrality calculation, blockmodeling, and clique and community detection. Get familiar with presentation-quality network visualization tools, both programmable and interactive--such as Gephi, a CNA explorer. Adapt the patterns from the case studies to your problems. Explore big networks with NetworKit, a high-performance networkx substitute. Each part in the book gives you an overview of a class of networks, includes a practical study of networkx functions and techniques, and concludes with case studies from various fields, including social networking, anthropology, marketing, and sports analytics. 
Combine your CNA and Python programming skills to become a better network analyst, a more accomplished data scientist, and a more versatile programmer. What You Need: You will need a Python 3.x installation with the following additional modules: Pandas (>=0.18), NumPy (>=1.10), matplotlib (>=1.5), networkx (>=1.11), python-louvain (>=0.5), NetworKit (>=3.6), and generalizedsimilarity. We recommend using the Anaconda distribution that comes with all these modules, except for python-louvain, NetworKit, and generalizedsimilarity, and works on all major modern operating systems.

Analyzing Baseball Data with R

With its flexible capabilities and open-source platform, R has become a major tool for analyzing detailed, high-quality baseball data. Analyzing Baseball Data with R provides an introduction to R for sabermetricians, baseball enthusiasts, and students interested in exploring the rich sources of baseball data. It equips readers with the necessary skills and software tools to perform all of the analysis steps, from gathering the datasets and entering them in a convenient format to visualizing the data via graphs to performing a statistical analysis. The authors first present an overview of publicly available baseball datasets and a gentle introduction to the type of data structures and exploratory and data management capabilities of R. They also cover the traditional graphics functions in the base package and introduce more sophisticated graphical displays available through the lattice and ggplot2 packages. Much of the book illustrates the use of R through popular sabermetrics topics, including the Pythagorean formula, runs expectancy, career trajectories, simulation of games and seasons, patterns of streaky behavior of players, and fielding measures. Each chapter contains exercises that encourage readers to perform their own analyses using R. All of the datasets and R code used in the text are available online. This book helps readers answer questions about baseball teams, players, and strategy using large, publicly available datasets. It offers detailed instructions on downloading the datasets and putting them into formats that simplify data exploration and analysis. Through the book’s various examples, readers will learn about modern sabermetrics and be able to conduct their own baseball analyses.
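The Pythagorean formula named among the sabermetrics topics estimates a team's winning percentage from runs scored and runs allowed. A minimal sketch follows, in Python rather than the book's R. It uses the classic exponent of 2, the function name is hypothetical, and the season totals are made up rather than taken from the book's datasets.

```python
def pythagorean_win_pct(runs_scored: float, runs_allowed: float,
                        exponent: float = 2.0) -> float:
    """Expected winning percentage: RS^e / (RS^e + RA^e)."""
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("run totals must be non-negative")
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    if rs + ra == 0:
        raise ValueError("at least one run total must be positive")
    return rs / (rs + ra)


# Hypothetical season totals, for illustration only.
expected = pythagorean_win_pct(800, 700)
print(f"Expected winning percentage: {expected:.3f}")
print(f"Expected wins over 162 games: {162 * expected:.1f}")
```

The exponent is left as a parameter because analysts often fit a value other than 2 to historical data.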

Practical Big Data Analytics

Practical Big Data Analytics is your ultimate guide to harnessing Big Data technologies for enterprise analytics and machine learning. By leveraging tools like Hadoop, Spark, NoSQL databases, and frameworks such as R, this book equips you with the skills to implement robust data solutions that drive impactful business insights. Gain practical expertise in handling data at scale and uncover the value behind the numbers. What this Book will help me do Master the fundamental concepts of Big Data storage, processing, and analytics. Gain practical skills in using tools like Hadoop, Spark, and NoSQL databases for large-scale data handling. Develop and deploy machine learning models and dashboards with R and R Shiny. Learn strategies for creating cost-efficient and scalable enterprise data analytics solutions. Understand and implement effective approaches to combining Big Data technologies for actionable insights. Author(s) Dasgupta is an expert in Big Data analytics, statistical methodologies, and enterprise data solutions. With years of experience consulting on enterprise data platforms and working with leading industry technologies, Dasgupta brings a wealth of practical knowledge to help readers navigate and succeed in the field of Big Data. Through this book, Dasgupta shares an accessible and systematic way to learn and apply key Big Data concepts. Who is it for? This book is ideal for professionals eager to delve into Big Data analytics, regardless of their current level of expertise. It accommodates both aspiring analysts and seasoned IT professionals looking to enhance their knowledge in data-driven decision making. Individuals with a technical inclination and a drive to build Big Data architectures will find this book particularly beneficial. No prior knowledge of Big Data is required, although familiarity with programming concepts will enhance the learning experience.

SAS Certification Prep Guide, 4th Edition

Prepare for the SAS Base Programming for SAS 9 exam with the official guide by the SAS Global Certification Program. New and experienced SAS users who want to prepare for the SAS Base Programming for SAS 9 exam will find this guide to be an invaluable, convenient, and comprehensive resource that covers all of the objectives tested on the exam. Now in its fourth edition, the guide has been extensively updated and revised to streamline explanations. Major topics include importing and exporting raw data files, creating and modifying SAS data sets, and identifying and correcting data syntax and programming logic errors. The chapter quizzes have been thoroughly updated and full solutions are included at the back of the book. In addition, links are provided to the exam objectives, practice exams, and other helpful resources, such as the updated Base SAS glossary and an expanded collection of practice data sets.

Statistical Rethinking

Statistical Rethinking: A Bayesian Course with Examples in R and Stan builds readers’ knowledge of and confidence in statistical modeling. Reflecting the need for even minor programming in today’s model-based statistics, the book pushes readers to perform step-by-step calculations that are usually automated. This unique computational approach ensures that readers understand enough of the details to make reasonable choices and interpretations in their own modeling work. The text presents generalized linear multilevel models from a Bayesian perspective, relying on a simple logical interpretation of Bayesian probability and maximum entropy. It covers topics ranging from the basics of regression to multilevel models. The author also discusses measurement error, missing data, and Gaussian process models for spatial and network autocorrelation. By using complete R code examples throughout, this book provides a practical foundation for performing statistical inference. Designed for both PhD students and seasoned professionals in the natural and social sciences, it prepares them for more advanced or specialized statistical modeling. Web Resource The book is accompanied by an R package (rethinking) that is available on the author’s website and GitHub. The two core functions (map and map2stan) of this package allow a variety of statistical models to be constructed from standard model formulas.

IBM SPSS Modeler Essentials

Learn how to leverage IBM SPSS Modeler for your data mining and predictive analytics needs in this comprehensive guide. With step-by-step instructions, you'll acquire the skills to import, clean, analyze, and model your data using this robust platform. By the end, you'll be equipped to uncover patterns and trends and to make data-driven decisions with confidence. What this Book will help me do Understand the fundamentals of data mining and the visual programming interface of IBM SPSS Modeler. Prepare, clean, and preprocess data effectively for analysis and modeling. Build robust predictive models such as decision trees using best practices. Evaluate the performance of your analytical models to ensure accuracy and reliability. Export resulting analyses to apply insights to real-world data projects. Author(s) Keith McCormick and Jesus Salcedo are accomplished professionals in data analytics and statistical modeling. With extensive experience in consulting and teaching, they have guided many in mastering IBM SPSS Modeler through both hands-on workshops and written material. Their approachable teaching style and commitment to clarity ensure accessibility for learners. Who is it for? This book is designed for beginner users of IBM SPSS Modeler who wish to gain practical and actionable skills in data analytics. If you're a data enthusiast looking to explore predictive analytics or a professional eager to discover the insights hidden in your organizational data, this book is for you. A basic understanding of data mining concepts is advantageous but not required. This resource will set any novice on the path toward expert-level comprehension and application.

Learning Alteryx

Learning Alteryx introduces you to using the powerful Alteryx platform for self-service analytics, helping you master key features like data preparation and predictive analytics without needing to code. With this book, you'll gain the skills to create workflows that generate actionable insights, empowering your business to make data-driven decisions. What this Book will help me do Master creating and optimizing workflows in Alteryx to address complex analytical problems. Learn how to clean, prepare, and blend data from various sources efficiently. Understand advanced Alteryx expressions for processing large datasets effectively. Develop meaningful reports and visualizations to communicate insights clearly. Leverage predictive analytics capabilities in Alteryx to make informed decisions. Author(s) The authors of Learning Alteryx collectively bring years of expertise in data analytics and business intelligence. Having worked on diverse projects across multiple industries, they understand the challenges faced by data professionals and are skilled in simplifying complex concepts. They focus on providing practical insights and step-by-step guides to empower learners. Who is it for? Learning Alteryx is ideal for professionals aspiring to enhance their data analytics capabilities or explore self-service analytics. It caters to beginners unfamiliar with analytics platforms, as well as intermediate users seeking to deepen their Alteryx knowledge. Readers should have a basic understanding of data analysis principles.

R Programming By Example

"R Programming By Example" serves as an engaging and practical introduction to the R programming language for data analysis and visualization. Through step-by-step examples and comprehensive guides, this book builds your understanding from foundational knowledge to advanced applications in R. You will master programming practices while analyzing real-world scenarios. What this Book will help me do Gain proficiency in leveraging R's versatile features and package ecosystem to tackle data analysis tasks. Learn to create and customize high-quality visualizations, including 3D graphs, for enhanced data presentation. Understand statistical modeling and descriptive analysis techniques for extracting insights from data. Discover efficient programming strategies in R, including code profiling and parallelization, to optimize performance. Acquire the skills to interface R with databases and RESTful APIs for robust data integration. Author(s) The author, Omar Trejo Navarro, brings a wealth of experience in statistical programming and data analysis. Having worked extensively with R, the author focuses on practical and results-driven teaching, with a passion for making complex topics accessible to learners. Who is it for? This book is aimed at aspiring data scientists, statisticians, or analysts looking to learn R. It is particularly suitable for readers familiar with basic programming concepts and who wish to apply R in practical scenarios. Whether you're analyzing data, building models, or creating visualizations, this book will guide you effectively. If you're eager to advance your R skills through hands-on projects, this is for you.

SciPy Recipes

Dive into the world of scientific computing with 'SciPy Recipes', a practical guide tailored for anyone seeking hands-on experience with the SciPy stack. With over 110 detailed recipes, you'll gain expertise in handling real-world data challenges, from statistical computations to crafting intricate visualizations and beyond. What this Book will help me do Learn to use the SciPy Stack libraries like NumPy, pandas, and matplotlib effectively for scientific computing tasks. Master data wrangling techniques using pandas for efficient data manipulation. Understand the process of creating informative visualizations using matplotlib. Perform advanced statistical and numerical computations with simplicity. Solve real-world problems like numerical analysis and linear algebra using SciPy components. Author(s) Martins, Ruben Oliva Ramos, and V Kishore Ayyadevara bring years of experience in scientific computing and Python programming to this book. Individually, they have contributed extensively to the implementation of computational tools and systems. Together, they've crafted this book to be both accessible to learners and insightful for practitioners, blending instruction with real-world practical applications. Who is it for? This book is designed for Python developers, data scientists, and analysts eager to venture into scientific computing. If you have a basic understanding of Python and aspire to effectively manipulate and visualize data using the SciPy stack, this book is perfect for you. It's equally beneficial for those who seek practical solutions to complex computational challenges. Begin your journey into scientific computing with this essential guide.

Adaptive Filtering

This book covers the fundamentals of adaptive filtering, with a focus on the least mean square (LMS) adaptive filter. It discusses random variables, stochastic processes, vectors, matrices, determinants, discrete random signals, and probability distributions, while delivering a concise introduction to MATLAB®, complete with problems, computer experiments, and over 110 functions and script files. The text not only addresses the basics of the LMS adaptive filter algorithm but also explores the Wiener filter and its applications, details the steepest descent method, and develops Newton’s algorithm.
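To make the LMS adaptive filter concrete, here is a small sketch of the standard update w <- w + mu * e * x applied to a toy system-identification problem. It is written in plain Python, not the book's MATLAB, and is not taken from the book. The function name, the "unknown" two-tap system, the step size, and the random input are all assumptions chosen for illustration.

```python
import random


def lms_filter(x, d, num_taps, mu):
    """Adapt FIR weights with the LMS rule w <- w + mu * e * x_vec."""
    w = [0.0] * num_taps
    errors = []
    for n in range(num_taps - 1, len(x)):
        # Most recent sample first: x[n], x[n-1], ..., x[n-num_taps+1].
        x_vec = [x[n - k] for k in range(num_taps)]
        y = sum(wk * xk for wk, xk in zip(w, x_vec))  # filter output
        e = d[n] - y                                  # estimation error
        w = [wk + mu * e * xk for wk, xk in zip(w, x_vec)]
        errors.append(e)
    return w, errors


# Hypothetical unknown system: d[n] = 0.5 * x[n] - 0.3 * x[n-1], noise-free.
rng = random.Random(0)
true_w = [0.5, -0.3]
x = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
d = [0.0] + [true_w[0] * x[n] + true_w[1] * x[n - 1]
             for n in range(1, len(x))]

w, errors = lms_filter(x, d, num_taps=2, mu=0.05)
print("estimated weights:", [round(v, 4) for v in w])
```

With a noise-free desired signal and a small step size, the estimated weights should approach the assumed system's coefficients. In practice the step size trades convergence speed against steady-state error.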

Biological and Medical Sensor Technologies

Edited by a pioneer in the area of advanced semiconductor materials, this book contains contributions from experts who explore the development and use of sensors in biological and medical applications. It covers advanced sensing and communications, modeling of DNA-derivative architecture, and the use of enzyme and quartz crystal microbalance-based biosensors. The book also addresses biosensors in human behavior measurement, sweat rate wearable sensors, and the future of medical imaging, including developments in spatial and spectral resolution of semiconductor detectors. Contributors discuss application of high-resolution CdTe detectors in gamma ray imaging and recent advances in positron emission tomography technology.

Electronically Scanned Arrays MATLAB® Modeling and Simulation

Electronically scanned arrays (ESAs) have become a key technology for sensor electronic systems. MATLAB® provides an excellent framework for ESA design and analysis, and this book is an invaluable resource for those who require simulation analysis tools that provide insight and understanding for ESA design. In addition to covering ESA fundamentals such as pattern synthesis, grating lobes, and instantaneous bandwidth, the text also provides insight into pattern optimization, subarray beamforming, space-based application of ESAs, and ESA reliability modeling. The book provides MATLAB code, giving readers an opportunity to model ESAs and develop an in-depth understanding that other books do not offer.