talk-data.com

Event

O'Reilly Data Science Books

2013-08-09 – 2026-02-25 O'Reilly

Activities tracked

2093

Collection of O'Reilly books on Data Science.

Sessions & talks

Introducing Data Science

Introducing Data Science teaches you how to accomplish the fundamental tasks that occupy data scientists. Using the Python language and common Python libraries, you'll experience firsthand the challenges of dealing with data at scale and gain a solid foundation in data science.
About the Technology: Many companies need developers with data science skills to work on projects ranging from social media marketing to machine learning. Discovering what you need to learn to begin a career as a data scientist can seem bewildering. This book is designed to help you get started.
About the Book: Introducing Data Science explains vital data science concepts and teaches you how to accomplish the fundamental tasks that occupy data scientists. You'll explore data visualization, graph databases, the use of NoSQL, and the data science process. You'll use the Python language and common Python libraries as you experience firsthand the challenges of dealing with data at scale. Discover how Python allows you to gain insights from data sets so big that they need to be stored on multiple machines, or from data moving so quickly that no single machine can handle it. This book gives you hands-on experience with the most popular Python data science libraries, Scikit-learn and StatsModels. After reading this book, you'll have the solid foundation you need to start a career in data science.
What's Inside: handling large data; introduction to machine learning; using Python to work with data; writing data science algorithms.
About the Reader: This book assumes you're comfortable reading code in Python or a similar language, such as C, Ruby, or JavaScript. No prior experience with data science is required.
About the Authors: Davy Cielen, Arno D. B. Meysman, and Mohamed Ali are the founders and managing partners of Optimately and Maiton, where they focus on developing data science projects and solutions in various sectors.
Quotes:
Read this book if you want to get a quick overview of data science, with lots of examples to get you started! - Alvin Raj, Oracle
The map that will help you navigate the data science oceans. - Marius Butuc, Shopify
Covers the processes involved in data science from end to end… A complete overview. - Heather Campbell, Kainos
A must-read for anyone who wants to get into the data science world. - Hector Cuesta, Big Data Bootcamp

A Course in Statistics with R

Integrates the theory and applications of statistics using R.
A Course in Statistics with R has been written to bridge the gap between theory and applications and explain how mathematical expressions are converted into R programs. The book has been primarily designed as a useful companion for a Masters student during each semester of the course, but will also help applied statisticians in revisiting the underpinnings of the subject. With this dual goal in mind, the book begins with R basics and quickly covers visualization and exploratory analysis. Probability and statistical inference, inclusive of classical, nonparametric, and Bayesian schools, is developed with definitions, motivations, mathematical expression and R programs in a way which will help the reader to understand the mathematical development as well as R implementation. Linear regression models, experimental designs, multivariate analysis, and categorical data analysis are treated in a way which makes effective use of visualization techniques and the related statistical techniques underlying them through practical applications, and hence helps the reader to achieve a clear understanding of the associated statistical models.
Key features: integrates R basics with statistical concepts; provides graphical presentations inclusive of mathematical expressions; aids understanding of limit theorems of probability with and without the simulation approach; presents detailed algorithmic development of statistical models from scratch; includes practical applications with over 50 data sets.

Regression for Economics, Second Edition

Regression analysis can be used to establish causal relationships between factors and the response variable. However, in order to be able to do so, economic theory must be used to provide the causal relationship and then regression analysis is applied to verify the validity of the theory. Regression analysis is the most commonly used analytical tool and can be understood without complex mathematics. This book simplifies and demystifies regression analysis. All the examples are from economics and in almost all the cases, real data is used to show the application of the method. By limiting the use of mathematical symbols, the author enables a logical reader to learn regression, without shortchanging the subject. The book is aimed at all business students and executives who need to understand the concept of regression for practical and professional purposes.

Learning Probabilistic Graphical Models in R

Explore the fundamentals of probabilistic graphical models (PGM) with hands-on examples using R. This book helps you translate theoretical concepts into practical solutions, addressing complex problems with Bayesian and Markov networks. It's written to demystify PGMs, equipping you to create robust models for inference, learning, and prediction.
What this Book will help me do: Understand and implement probabilistic graphical models, including Bayesian and Markov networks, directly in R. Learn to use various R packages for performing inference and analyzing probabilistic models. Master the essentials of Bayesian methods, transitioning to advanced concepts with clear, step-by-step guidance. Familiarize yourself with methods like PCA and ICA for analyzing and reducing complex data dimensions. Develop practical skills to apply PGM techniques to machine learning challenges and real-world data problems.
Author(s): The authors bring diverse expertise in probabilistic modeling, R programming, and applied machine learning. They are passionate educators and technical writers, focusing on breaking down complex theories into accessible knowledge. Their writing emphasizes practical demonstration, leveraging their industry and academic experiences.
Who is it for? This book is designed for data scientists, engineers, and machine learning enthusiasts who wish to enhance their understanding of probabilistic graphical models. Whether you're curious about Bayesian methods or looking to apply PGM approaches to data-rich challenges, this guide is perfect for learners at an intermediate level, offering practical insights and real-world applications.

Practical Data Analysis Cookbook

Practical Data Analysis Cookbook takes you on a comprehensive journey to mastering data exploration and analysis using Python. From data cleaning and transformation to building predictive and classification models, this book provides practical recipes for tackling real-world data challenges and extracting valuable insights.
What this Book will help me do: Efficiently clean, transform, and explore datasets using tools like pandas and OpenRefine. Develop predictive models for time series and other datasets using Python libraries such as scikit-learn and Statsmodels. Apply clustering and classification techniques to real-world data problems to gain actionable insights. Explore advanced topics like natural language processing and graph theory concepts using specialized tools. Build the skills to solve practical data modeling problems encountered in a data science role.
Author(s): Drabas is an experienced data scientist and author who specializes in Python-based data analysis. With a background in tackling intricate data-driven problems, Drabas brings real-world experience to the readers. In creating this Cookbook, Drabas adopts a step-by-step approach, making complex techniques accessible to learners of all backgrounds.
Who is it for? If you are a data analyst, data scientist, or someone interested in exploring Python for practical data problems, this book is for you. It suits beginners starting their data journey and intermediate professionals looking to enhance their toolset. With clear instructions, it's ideal for anyone willing to build practical skills and tackle real-world challenges in data analysis.

RStudio for R Statistical Computing Cookbook

Dive into the practical applications of RStudio with this comprehensive cookbook, designed to help analysts and data scientists unlock the full potential of RStudio's features. You'll enhance your statistical computing, data visualization, and reporting skills through over 50 carefully curated recipes, each seamlessly blending conceptual understanding with hands-on implementation.
What this Book will help me do: Master the latest advanced R console features for a smooth coding experience. Create dynamic and interactive visualizations to effectively represent data insights. Improve R project management to organize and maintain reproducibility in your analyses. Apply statistical and predictive modeling techniques tailored for diverse application domains. Develop interactive web applications and detailed reports with R Markdown and Shiny.
Author(s): Andrea Cirillo is an experienced data scientist with a deep knowledge of statistical computing and data analysis. Through his professional and academic career, Andrea has developed a knack for teaching and simplifying complex programming and statistics concepts. His passion is helping others advance their skills with practical, hands-on resources.
Who is it for? This book is tailored for data scientists, statisticians, and R programmers with foundational R programming skills. It is ideal for professionals who aim to enhance their fluency with RStudio and improve their statistical analysis capabilities. Whether you're structuring your first analytical project or refining your data visualization techniques, this book is designed to assist your growth. Overall, the audience includes anyone seeking practical expertise in RStudio for impactful data analysis.

NumPy Essentials

NumPy Essentials is your guide to mastering NumPy, the powerful Python library for scientific computing. In this book, you'll discover how to manipulate arrays, perform mathematical operations, and create advanced models. With its clear examples and practical exercises, you'll build the skills needed to efficiently tackle analytical challenges.
What this Book will help me do: Learn to manipulate data efficiently with NumPy array objects and universal functions. Gain proficiency in solving linear algebra problems using NumPy's powerful modules. Master regression techniques and curve fitting for statistical modeling. Apply Fourier Transform and spectral analysis in solving real-world problems. Integrate and optimize Python code using Cython and the NumPy C API for higher performance.
Author(s): Jaidev Deshpande, Chin, Tanmay Dutta, and Shane Holloway are seasoned developers passionate about Python and scientific computing. With experience across diverse projects, they bring practical insights and accessible explanations to their writing.
Who is it for? This book is ideal for Python developers seeking to sharpen their numerical computing skills. Prior experience with Python is expected, as the content progresses quickly to advanced topics. Whether you're working in data analysis, scientific research, or machine learning, this book will provide valuable tools and insights.

Good Charts

Dataviz: the new language of business.
A good visualization can communicate the nature and potential impact of information and ideas more powerfully than any other form of communication. For a long time “dataviz” was left to specialists: data scientists and professional designers. No longer. A new generation of tools and massive amounts of available data make it easy for anyone to create visualizations that communicate ideas far more effectively than generic spreadsheet charts ever could. What’s more, building good charts is quickly becoming a need-to-have skill for managers. If you’re not doing it, other managers are, and they’re getting noticed for it and getting credit for contributing to your company’s success.
In Good Charts, dataviz maven Scott Berinato provides an essential guide to how visualization works and how to use this new language to impress and persuade. Dataviz today is where spreadsheets and word processors were in the early 1980s: on the cusp of changing how we work. Berinato lays out a system for thinking visually and building better charts through a process of talking, sketching, and prototyping.
This book is much more than a set of static rules for making visualizations. It taps into both well-established and cutting-edge research in visual perception and neuroscience, as well as the emerging field of visualization science, to explore why good charts (and bad ones) create “feelings behind our eyes.” Along the way, Berinato also includes many engaging vignettes of dataviz pros, illustrating the ideas in practice. Good Charts will help you turn plain, uninspiring charts that merely present information into smart, effective visualizations that powerfully convey ideas.

Age-Period-Cohort Analysis

This book explores the ways in which statistical models, methods, and research designs can be used to open new possibilities for APC analysis. Within a single, consistent HAPC-GLMM statistical modeling framework, the authors synthesize APC models and methods for three research designs: age-by-time period tables of population rates or proportions, repeated cross-section sample surveys, and accelerated longitudinal panel studies. They show how the empirical application of the models to various problems leads to many fascinating findings on how outcome variables develop along the age, period, and cohort dimensions.

Big Data and Business Analytics

With the increasing barrage of big data, it becomes vital for organizations to make sense of this data in a timely and effective way to improve their decision making and competitive advantage. That's where business analytics come into play. This book explores case studies from industry leaders in big data domains such as cybersecurity, marketing, finance, emergency management, healthcare, and transportation. It offers a concise guide for CEOs and senior managers, as well as for business, management, and technology students interested in this emerging field.

Bio-Inspired Computing and Networking

From ant-inspired allocation to a swarm algorithm derived from honeybees, this book explains how the study of biological systems can significantly improve computing, networking, and robotics. Containing contributions from leading researchers from around the world, the book investigates the fundamental aspects and applications of bio-inspired computing and networking. Presenting the latest advances in bio-inspired communication, computing, networking, clustering, optimization, and robotics, the book considers state-of-the-art approaches, novel technologies, and experimental studies.

Computational Intelligent Data Analysis for Sustainable Development

Going beyond performing simple analyses, researchers involved in the highly dynamic field of computational intelligent data analysis design algorithms that solve increasingly complex data problems in changing environments, including economic, environmental, and social data. This volume presents novel methodologies for automatically processing these types of data to support rational decision making for sustainable development. Through numerous case studies and applications, it illustrates important data analysis methods, including mathematical optimization, machine learning, signal processing, and temporal and spatial analysis, for quantifying and describing sustainable development problems.

Constrained Principal Component Analysis and Related Techniques

This book shows how constrained principal component analysis (CPCA) offers a unified framework for regression techniques and PCA. Keeping the use of complicated iterative methods to a minimum, the book includes implementation details and many real application examples. It also offers material for methodologically oriented readers interested in developing statistical techniques of their own. MATLAB programs as well as data to create the book's examples are available on the author's website.

Contrast Data Mining

This work collects recent results from this specialized area of data mining that have previously been scattered in the literature, making them more accessible to researchers and developers in data mining and other fields. The book not only presents concepts and techniques for contrast data mining, but also explores the use of contrast mining to solve challenging problems in various scientific, medical, and business domains. It examines how contrast mining is used in discriminative gene transfer and microarray analysis, computational toxicology, spatial and image data classification, network security, and many more applications.

Electromagnetic Waves, Materials, and Computation with MATLAB®

This book is for senior undergraduate/first-year graduate students specializing in one or more of the technologies based on electromagnetics. Composed of three parts, it begins with the electromagnetics of bounded simple media, moves on to electromagnetic equations of complex media, and then covers electromagnetic computation. The author takes a modern approach by using commercial software such as MATLAB and FDTD methods and provides a strong base of conceptual mathematical aspects. The material strikes a balance between theory, intuitive approximate solutions, and the use of commercial software and interpretation of solutions. Case studies and practical examples are presented throughout the text.

Genome Annotation

This thorough overview explores automated genome analysis and annotation from its origins to the challenges of next-generation sequencing data analysis. It explains how current analysis strategies were developed, including sequencing strategies, statistical models, and early annotation systems. The authors then present visualization techniques f

Incomplete Categorical Data Design

A self-contained, systematic introduction, this book shows you how to draw valid statistical inferences from survey data with sensitive characteristics. It guides you in applying the non-randomized response approach in surveys and new non-randomized response designs. The techniques covered integrate the strengths of existing approaches, including randomized response models, incomplete categorical data design, the EM algorithm, the bootstrap method, and the data augmentation algorithm. All R code for the examples is available online.
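
To give a feel for the randomized response models mentioned among the existing approaches, here is a minimal Python sketch of the classic Warner design. It is not taken from the book, which concentrates on non-randomized designs and provides its own R code. The function names, the 20% trait share, and the 0.7 design probability are all hypothetical choices for illustration. Each respondent privately answers either the sensitive question or its complement, so no single answer reveals their status, yet the population share can still be recovered from the overall "yes" rate.

```python
import random


def simulate_warner_survey(true_share, p_direct, n, rng):
    """Simulate a Warner randomized response survey and return the 'yes' rate.

    With probability p_direct a respondent answers "Do you have the trait?";
    otherwise they answer the complement, "Do you NOT have the trait?".
    All names and parameters here are illustrative assumptions.
    """
    yes_count = 0
    for _ in range(n):
        has_trait = rng.random() < true_share
        asked_direct = rng.random() < p_direct
        answer_yes = has_trait if asked_direct else not has_trait
        yes_count += int(answer_yes)
    return yes_count / n


def warner_estimate(yes_rate, p_direct):
    """Recover the trait share from the observed 'yes' rate.

    P(yes) = p * pi + (1 - p) * (1 - pi), so
    pi = (P(yes) - (1 - p)) / (2p - 1), which is undefined at p = 0.5.
    """
    if p_direct == 0.5:
        raise ValueError("p_direct must differ from 0.5")
    return (yes_rate - (1.0 - p_direct)) / (2.0 * p_direct - 1.0)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    yes_rate = simulate_warner_survey(true_share=0.2, p_direct=0.7, n=100_000, rng=rng)
    print("observed yes rate:", round(yes_rate, 4))
    print("estimated trait share:", round(warner_estimate(yes_rate, 0.7), 4))
```

The privacy protection comes at a price: dividing by 2p - 1 inflates the estimator's variance, and the inflation grows as p approaches 0.5.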

Multi-Label Dimensionality Reduction

The data mining and machine learning literature currently lacks a unified treatment of multi-label dimensionality reduction that incorporates both algorithmic developments and applications. Addressing this shortfall, this book covers the methodological developments, theoretical properties, computational aspects, and applications of many multi-label dimensionality reduction algorithms, including existing dimensionality reduction algorithms and new developments of traditional algorithms. It illustrates how to apply the algorithms to solve real-world problems. A supplementary website provides a MATLAB package for implementing popular dimensionality reduction algorithms.

Radar Systems Analysis and Design Using MATLAB, 3rd Edition

Developed from the author's graduate-level courses, the first edition of this book filled the need for a comprehensive, self-contained, and hands-on treatment of radar systems analysis and design. It quickly became a bestseller and was widely adopted by many professors. The second edition built on this successful format by rearranging and updating

RapidMiner

Written by leaders in the data mining community, including the developers of the RapidMiner software, this book provides an in-depth introduction to the application of data mining and business analytics techniques and tools in scientific research, medicine, industry, commerce, and diverse other sectors. It presents the most powerful and flexible open source software solutions: RapidMiner and RapidAnalytics. The book and software tools cover all relevant steps of the data mining process. The software and their extensions can be freely downloaded at www.RapidMiner.com.

Signals and Systems

This text employs MATLAB both computationally and pedagogically to provide interactive visual reinforcement of the fundamentals, including the characteristics of signals, operations used on signals, time and frequency domain analyses of systems, continuous-time and discrete-time signals and systems, and more. The book includes hands-on MATLAB modules linked to specific segments of the text to ensure seamless integration between learning and doing. A solutions manual, MATLAB code, figures, presentation slides, and other ancillary materials are available on an author-supported website or with qualifying course adoption.

Simulation of Dynamic Systems with MATLAB and Simulink, 2nd Edition

"… a seminal text covering the simulation design and analysis of a broad variety of systems using two of the most modern software packages available today. … particularly adept [at] enabling students new to the field to gain a thorough understanding of the basics of continuous simulation in a single semester, and [also provides] a more advanced treatment of the subject for researchers and simulation professionals." - From the Foreword by Chris Bauer, PhD, PE, CMSP
Continuous-system simulation is an increasingly important tool for optimizing the performance of real-world systems, and a massive transformation has occurred in the application of simulation in fields ranging from engineering and physical sciences to medicine, biology, economics, and applied mathematics. As with most things, simulation is best learned through practice, but explosive growth in the field requires a new learning approach. A response to changes in the field, Simulation of Dynamic Systems with MATLAB® and Simulink®, Second Edition has been extensively updated to help readers build an in-depth and intuitive understanding of basic concepts, mathematical tools, and the common principles of various simulation models for different phenomena.
Includes an abundance of case studies, real-world examples, homework problems, and equations to develop a practical understanding of concepts.
Accomplished experts Harold Klee and Randal Allen take readers through a gradual and natural progression of important topics in simulation, introducing advanced concepts only after they construct complete examples using fundamental methods. Presented exercises incorporate MATLAB® and Simulink®, including access to downloadable M-files and model files, enabling both students and professionals to gain experience with these industry-standard tools and more easily design, implement, and adjust simulation models in their particular field of study.
More universities are offering courses, as well as master's and Ph.D. programs, in both continuous-time and discrete-time simulation, promoting a new interdisciplinary focus that appeals to undergraduates and beginning graduates from a wide range of fields. Ideal for such courses, this classroom-tested introductory text presents a flexible, multifaceted approach through which simulation can play a prominent role in validating system design and training personnel involved.

Statistical Methods for QTL Mapping

While numerous advanced statistical approaches have recently been developed for quantitative trait loci (QTL) mapping, the methods are scattered throughout the literature. This book brings together many recent statistical techniques that address the data complexity of QTL mapping. It emphasizes the modern statistical methodology for QTL mapping as well as the statistical issues that arise during this process. The book gives the necessary biological background for statisticians without training in genetics and, likewise, covers statistical thinking and principles for geneticists.

Statistics and Data Analysis for Microarrays Using R and Bioconductor, 2nd Edition

Richly illustrated in color, this bestselling text provides a clear and rigorous description of powerful analysis techniques and algorithms for mining and interpreting biological information. Omitting tedious details, heavy formalisms, and cryptic notations, the text takes a hands-on, example-based approach that explains the basics of R and micr

Stochastic Financial Models

Developed from the esteemed author's advanced undergraduate and graduate courses at the University of Cambridge, this text provides a hands-on, sound introduction to mathematical finance. Assuming no prior knowledge of stochastic calculus or measure-theoretic probability, the author includes the relevant mathematical background as well as many exercises with solutions. He first presents the classical topics of utility and the mean-variance approach to portfolio choice. Focusing on derivative pricing, the text then covers the binomial model, the general discrete-time model, Brownian motion, the Black-Scholes model, and various interest-rate models.
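
The link between two of the topics listed above, the binomial model and the Black-Scholes model, can be sketched in a few lines of Python. This is an illustrative sketch, not the book's own treatment. It assumes the Cox-Ross-Rubinstein parameterisation of the binomial tree and a non-dividend-paying European call, and the function names and the numerical inputs (spot 100, strike 100, rate 5%, volatility 20%, one year) are hypothetical. As the number of steps grows, the tree price should approach the closed-form value.

```python
import math


def black_scholes_call(s0, k, r, sigma, t):
    """Closed-form Black-Scholes price of a European call (no dividends)."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)

    def norm_cdf(x):
        # Standard normal CDF expressed through the error function.
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    return s0 * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)


def binomial_call(s0, k, r, sigma, t, steps):
    """European call price on a Cox-Ross-Rubinstein binomial tree (assumed setup)."""
    dt = t / steps
    up = math.exp(sigma * math.sqrt(dt))
    down = 1.0 / up
    q = (math.exp(r * dt) - down) / (up - down)  # risk-neutral up probability
    discount = math.exp(-r * dt)

    # Payoffs at maturity, indexed by the number of up-moves j.
    values = [max(s0 * up ** j * down ** (steps - j) - k, 0.0) for j in range(steps + 1)]

    # Backward induction: discounted risk-neutral expectation at each node.
    for _ in range(steps):
        values = [
            discount * (q * values[j + 1] + (1.0 - q) * values[j])
            for j in range(len(values) - 1)
        ]
    return values[0]


if __name__ == "__main__":
    params = dict(s0=100.0, k=100.0, r=0.05, sigma=0.2, t=1.0)
    print("Black-Scholes:", round(black_scholes_call(**params), 4))
    for n in (10, 100, 500):
        print(f"binomial, {n} steps:", round(binomial_call(steps=n, **params), 4))
```

Backward induction on a recombining tree needs only a single list of node values per step, so the cost grows quadratically in the number of steps rather than exponentially.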