talk-data.com

Event

O'Reilly Data Science Books

2013-08-09 – 2026-02-25 · O'Reilly

Activities tracked

794

Collection of O'Reilly books on Data Science.

Sessions & talks

Showing 401–425 of 794 · Newest first

Python Web Scraping

Explore the possibilities of web scraping using Python with this practical guide. The book provides a comprehensive introduction to extracting information from web pages, managing complex scraping scenarios, and utilizing specialized tools such as Scrapy. Whether you're dealing with static pages or interactive web content, this book equips you with the skills to gather and process web data efficiently. What this Book will help me do Gain proficiency in writing Python scripts to extract data from web pages. Learn to build and manage multithreaded crawlers to handle large-scale scraping tasks. Master techniques for interacting with dynamic web content and JavaScript-rendered pages. Understand how to work with web forms, sessions, and tackle challenges like CAPTCHA. Implement practical examples of web scraping using Scrapy for real-world data projects. Author(s) Richard Penman is an experienced software engineer and an expert in Python programming and web development. With years of practical expertise in web crawling and data extraction, Richard shares his extensive knowledge in this field to make complex tasks accessible to developers of all levels. His thoughtful approach aims to empower readers to confidently tackle data challenges on the web. Who is it for? This book is ideal for developers and technical professionals who want to learn effective techniques for web scraping with Python. A basic understanding of programming concepts and experience with Python will help readers get the most out of the practical examples. It's also suitable for advanced learners looking to apply Python skills for automating web data extraction tasks. If you're enthusiastic about turning web data into actionable insights, this guide is for you.
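As a rough illustration of the kind of static-page extraction the blurb describes, here is a minimal sketch that collects links from an HTML string using only Python's standard library. It is not taken from the book; the `LinkExtractor` class, the `extract_links` helper, and the sample markup are all hypothetical names invented for this example, and a real crawler would also need to fetch pages, respect robots.txt, and handle errors.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects absolute URLs from <a href="..."> tags (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # Skip anchors without an href; resolve relative paths against the base.
            if name == "href" and value:
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return every link found in the given HTML, in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


# Made-up markup standing in for a downloaded page.
SAMPLE_HTML = (
    '<html><body><a href="/books">Books</a> '
    '<a href="https://example.com/about">About</a><a>no href</a></body></html>'
)

links = extract_links(SAMPLE_HTML, "https://example.com/")
print(links)
```

Parsing with `html.parser` keeps the sketch dependency-free; frameworks such as Scrapy, which the book covers, add scheduling, retries, and pipelines on top of this basic idea.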

Creating Stunning Dashboards with QlikView

Explore the world of QlikView dashboards with this comprehensive guide that walks you through the entire process of creating effective and visually engaging dashboards for your business needs. From identifying KPIs to rolling out your application, this book provides actionable steps and best practices for delivering data-driven results. What this Book will help me do Define key performance indicators (KPIs) based on business objectives and goals. Design and structure dashboards using best practices in data visualization. Master creating various chart types, including bar, line, pie charts, and advanced visualizations, like heat maps. Integrate data from multiple sources, such as ERP systems and spreadsheets, into a cohesive dashboard. Learn the steps to develop mobile-optimized dashboards for accessibility on the go. Author(s) Villafuerte, a seasoned expert in data visualization and QlikView development, brings a wealth of experience to this book. With years of hands-on work creating impactful dashboards for various business needs, the author's pragmatic and result-oriented approach provides readers with practical and insightful knowledge. Who is it for? The book is tailored for QlikView developers who already possess a basic understanding of scripting and dashboard layout design. It's ideal for professionals aiming to enhance their design and visualization skills. Additionally, business analysts or managers with a technical inclination could also benefit from its comprehensive approach to creating interactive dashboards. If building effective and appealing dashboards that drive business impact is your goal, this book is for you.

Beginning R: An Introduction to Statistical Programming, Second Edition

Beginning R, Second Edition is a hands-on book showing how to use the R language, write and save R scripts, read in data files, and write custom statistical functions as well as use built-in functions. This book shows the use of R in specific cases such as one-way ANOVA analysis, linear and logistic regression, data visualization, parallel processing, bootstrapping, and more. It takes a hands-on, example-based approach incorporating best practices with clear explanations of the statistics being done. It has been completely re-written since the first edition to make use of the latest packages and features in R version 3. R is a powerful open-source language and programming environment for statistics and has become the de facto standard for doing, teaching, and learning computational statistics. R is both an object-oriented language and a functional language that is easy to learn, easy to use, and completely free. A large community of dedicated R users and programmers provides an excellent source of R code, functions, and data sets, with a constantly evolving ecosystem of packages providing new functionality for data analysis. R has also become popular in commercial use at companies such as Microsoft, Google, and Oracle. Your investment in learning R is sure to pay off in the long term as R continues to grow into the go-to language for data analysis and research.

Learning Shiny

Have you ever wanted to transform your data analysis in R into interactive, web-based dashboards and applications? "Learning Shiny" is your guide to mastering R's Shiny framework to create dynamic, visual, and engaging web applications. With its step-by-step approach, this book enables you to harness Shiny's features effectively. What this Book will help me do Understand the core principles of R and data processing using tools like apply and lapply, empowering you to handle data programmatically. Learn the Shiny framework fundamentals, including structuring an interactive application using UI and server scripts. Create stunning visualizations and dashboards using libraries like ggplot2 and integrate Shiny seamlessly. Deploy and host Shiny web applications on Linux servers for effective sharing and collaboration. Enhance your applications with JavaScript integrations, using tools like D3.js, for advanced customization. Author(s) Hernan Resnizky is a renowned data scientist and educator with extensive experience in R programming and Shiny application development. Known for his clear teaching style, he has guided numerous professionals in using R for real-world applications. His practical approach ensures readers not only learn techniques but understand how to apply them effectively. Who is it for? "Learning Shiny" is ideal for data scientists looking to showcase their work through interactive web apps and visualizations, and for web developers curious about leveraging the Shiny framework in R. Beginners as well as those with some R experience will find tailored guidance to suit their level. If you aim to expand your toolkit with web-focused R capabilities, this book is for you.

Data Preparation in the Big Data Era

Preparing and cleaning data is notoriously expensive, prone to error, and time consuming: the process accounts for roughly 80% of the total time spent on analysis. As this O’Reilly report points out, enterprises have already invested billions of dollars in big data analytics, so there’s great incentive to modernize methods for cleaning, combining, and transforming data. Author Federico Castanedo, Chief Data Scientist at WiseAthena.com, details best practices for reducing the time it takes to convert raw data into actionable insights. With these tools and techniques in mind, your organization will be well positioned to translate big data into big decisions. Explore the problems organizations face today with traditional prep and integration Define the business questions you want to address before selecting, prepping, and analyzing data Learn new methods for preparing raw data, including date-time and string data Understand how some cleaning actions (like replacing missing values) affect your analysis Examine data curation products: modern approaches that scale Consider your business audience when choosing ways to deliver your analysis
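To make the point about cleaning actions concrete, the following sketch shows one well-known side effect of replacing missing values with the mean: the mean is preserved, but the sample variance shrinks. This is an illustration under stated assumptions, not a method from the report; the `raw` values are invented for the example.

```python
from statistics import mean, variance

# Hypothetical sample with two missing readings (None); not data from the report.
raw = [10, 12, None, 14, None, 16]

observed = [x for x in raw if x is not None]
fill_value = mean(observed)

# Mean imputation: replace each missing value with the mean of the observed ones.
imputed = [fill_value if x is None else x for x in raw]

var_observed = variance(observed)  # sample variance of the observed values only
var_imputed = variance(imputed)    # sample variance after filling the gaps

print(f"mean before/after: {mean(observed)} / {mean(imputed)}")
print(f"variance before/after: {var_observed:.2f} / {var_imputed:.2f}")
```

Because the filled-in values sit exactly at the mean, they add observations without adding spread, so any downstream analysis that relies on the variance (confidence intervals, significance tests) may look more certain than the data justify.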

Dashboards for Excel

The book takes a hands-on approach to developing dashboards, from instructing users on advanced Excel techniques to addressing dashboard pitfalls common in the real world. Dashboards for Excel is your key to creating informative, actionable, and interactive dashboards and decision support systems. Throughout the book, the reader is challenged to think about Excel and data analytics differently—that is, to think outside the cell. This book shows you how to create dashboards in Excel quickly and effectively. In this book, you learn how to: Apply data visualization principles for more effective dashboards Employ dynamic charts and tables to create dashboards that are constantly up-to-date and providing fresh information Use understated yet powerful formulas for Excel development Apply advanced Excel techniques mixing formulas and Visual Basic for Applications (VBA) to create interactive dashboards Create dynamic systems for decision support in your organization Avoid common problems in Excel development and dashboard creation Get started with the Excel data model, PowerPivot, and Power Query

Inferential Models

This book introduces the authors' recently developed approach to inference: the inferential model (IM) framework. This logical framework for exact probabilistic inference does not require the user to input prior information. The book covers the foundational motivations for this new approach, the basic theory behind its calibration properties, many important applications, and new directions for research. It explores a new way of thinking compared to existing schools of thought on statistical inference and encourages readers to think carefully about the correct approach to scientific inference.

Introduction to Probability

Developed from celebrated Harvard statistics lectures, Introduction to Probability provides essential language and tools for understanding statistics, randomness, and uncertainty. The book explores a wide variety of applications and examples, ranging from coincidences and paradoxes to Google PageRank and Markov chain Monte Carlo (MCMC). Additional application areas explored include genetics, medicine, computer science, and information theory. The print book version includes a code that provides free access to an eBook version. The authors present the material in an accessible style and motivate concepts using real-world examples. Throughout, they use stories to uncover connections between the fundamental distributions in statistics and conditioning to reduce complicated problems to manageable pieces. The book includes many intuitive explanations, diagrams, and practice problems. Each chapter ends with a section showing how to perform relevant simulations and calculations in R, a free statistical software environment.

Methods and Applications of Longitudinal Data Analysis

Methods and Applications of Longitudinal Data Analysis describes methods for the analysis of longitudinal data in the medical, biological and behavioral sciences. It introduces basic concepts and functions including a variety of regression models, and their practical applications across many areas of research. Statistical procedures featured within the text include: descriptive methods for delineating trends over time linear mixed regression models with both fixed and random effects covariance pattern models on correlated errors generalized estimating equations nonlinear regression models for categorical repeated measurements techniques for analyzing longitudinal data with non-ignorable missing observations Emphasis is given to applications of these methods, using substantial empirical illustrations, designed to help users of statistics better analyze and understand longitudinal data. Methods and Applications of Longitudinal Data Analysis equips both graduate students and professionals to confidently apply longitudinal data analysis to their particular discipline. It also provides a valuable reference source for applied statisticians, demographers and other quantitative methodologists. From novice to professional: this book starts with the introduction of basic models and ends with the description of some of the most advanced models in longitudinal data analysis Enables students to select the correct statistical methods to apply to their longitudinal data and avoid the pitfalls associated with incorrect selection Identifies the limitations of classical repeated measures models and describes newly developed techniques, along with real-world examples.

An Introduction to Probability and Statistics, 3rd Edition

A well-balanced introduction to probability theory and mathematical statistics. Featuring updated material, An Introduction to Probability and Statistics, Third Edition remains a solid overview of probability theory and mathematical statistics. Divided into three parts, the Third Edition begins by presenting the fundamentals and foundations of probability. The second part addresses statistical inference, and the remaining chapters focus on special topics. An Introduction to Probability and Statistics, Third Edition includes: A new section on regression analysis to include multiple regression, logistic regression, and Poisson regression A reorganized chapter on large sample theory to emphasize the growing role of asymptotic statistics Additional topical coverage on bootstrapping, estimation procedures, and resampling Discussions on invariance, ancillary statistics, conjugate prior distributions, and invariant confidence intervals Over 550 problems and answers to most problems, as well as 350 worked out examples and 200 remarks Numerous figures to further illustrate examples and proofs throughout An Introduction to Probability and Statistics, Third Edition is an ideal reference and resource for scientists and engineers in the fields of statistics, mathematics, physics, industrial management, and engineering. The book is also an excellent text for upper-undergraduate and graduate-level students majoring in probability and statistics.

Fundamentals of Statistical Experimental Design and Analysis

Professionals in all areas - business; government; the physical, life, and social sciences; engineering; medicine, etc. - benefit from using statistical experimental design to better understand their worlds and then use that understanding to improve the products, processes, and programs they are responsible for. This book aims to provide the practitioners of tomorrow with a memorable, easy to read, engaging guide to statistics and experimental design. This book uses examples, drawn from a variety of established texts, and embeds them in a business or scientific context, seasoned with a dash of humor, to emphasize the issues and ideas that led to the experiment and the what-do-we-do-next? steps after the experiment. Graphical data displays are emphasized as means of discovery and communication and formulas are minimized, with a focus on interpreting the results that software produce. The role of subject-matter knowledge, and passion, is also illustrated. The examples do not require specialized knowledge, and the lessons they contain are transferrable to other contexts. Fundamentals of Statistical Experimental Design and Analysis introduces the basic elements of an experimental design, and the basic concepts underlying statistical analyses. Subsequent chapters address the following families of experimental designs: Completely Randomized designs, with single or multiple treatment factors, quantitative or qualitative Randomized Block designs Latin Square designs Split-Unit designs Repeated Measures designs Robust designs Optimal designs Written in an accessible, student-friendly style, this book is suitable for a general audience and particularly for those professionals seeking to improve and apply their understanding of experimental design.

Statistics for Big Data For Dummies

The fast and easy way to make sense of statistics for big data Does the subject of data analysis make you dizzy? You've come to the right place! Statistics For Big Data For Dummies breaks this often-overwhelming subject down into easily digestible parts, offering new and aspiring data analysts the foundation they need to be successful in the field. Inside, you'll find an easy-to-follow introduction to exploratory data analysis, the lowdown on collecting, cleaning, and organizing data, everything you need to know about interpreting data using common software and programming languages, plain-English explanations of how to make sense of data in the real world, and much more. Data has never been easier to come by, and the tools students and professionals need to enter the world of big data are based on applied statistics. While the word "statistics" alone can evoke feelings of anxiety in even the most confident student or professional, it doesn't have to. Written in the familiar and friendly tone that has defined the For Dummies brand for more than twenty years, Statistics For Big Data For Dummies takes the intimidation out of the subject, offering clear explanations and tons of step-by-step instruction to help you make sense of data mining—without losing your cool. Helps you to identify valid, useful, and understandable patterns in data Provides guidance on extracting previously unknown information from large databases Shows you how to discover patterns available in big data Gives you access to the latest tools and techniques for working in big data If you're a student enrolled in a related Applied Statistics course or a professional looking to expand your skillset, Statistics For Big Data For Dummies gives you access to everything you need to succeed.

Bent Functions

Bent Functions: Results and Applications to Cryptography offers a unique survey of the objects of discrete mathematics known as Boolean bent functions. As these maximal, nonlinear Boolean functions and their generalizations have many theoretical and practical applications in combinatorics, coding theory, and cryptography, the text provides a detailed survey of their main results, presenting a systematic overview of their generalizations and applications, and considering open problems in classification and systematization of bent functions. The text is appropriate for novices and advanced researchers, discussing proofs of several results, including the automorphism group of bent functions, the lower bound for the number of bent functions, and more. Provides a detailed survey of bent functions and their main results, presenting a systematic overview of their generalizations and applications Presents a systematic and detailed survey of hundreds of results in the area of highly nonlinear Boolean functions in cryptography Appropriate coverage for students from advanced specialists in cryptography, mathematics, and creators of ciphers

Semialgebraic Statistics and Latent Tree Models

This book explains how to analyze statistical models with hidden (latent) variables. It takes a systematic, geometric approach to studying the semialgebraic structure of latent tree models. The first part of the book introduces key concepts in algebraic statistics, focusing on methods that are helpful in the study of models with hidden variables. The second part illustrates important examples of tree models with hidden variables. The author develops the important concepts of L-cumulants and links latent tree models and various tree spaces.

Recursion Theory

This monograph presents recursion theory from a generalized point of view centered on the computational aspects of definability. A major theme is the study of the structures of degrees arising from two key notions of reducibility, the Turing degrees and the hyperdegrees, using techniques and ideas from recursion theory, hyperarithmetic theory, and descriptive set theory. The emphasis is on the interplay between recursion theory and set theory, anchored on the notion of definability. The monograph covers a number of fundamental results in hyperarithmetic theory as well as some recent results on the structure theory of Turing and hyperdegrees. It also features a chapter on the applications of these investigations to higher randomness.

U Can: Statistics For Dummies

Make studying statistics simple with this easy-to-read resource Wouldn't it be wonderful if studying statistics were easier? With U Can: Statistics I For Dummies, it is! This one-stop resource combines lessons, practical examples, study questions, and online practice problems to provide you with the ultimate guide to help you score higher in your statistics course. Foundational statistics skills are a must for students of many disciplines, and leveraging study materials such as this one to supplement your statistics course can be a life-saver. Because U Can: Statistics I For Dummies contains both the lessons you need to learn and the practice problems you need to put the concepts into action, you'll breeze through your scheduled study time. Statistics is all about collecting and interpreting data, and is applicable in a wide range of subject areas—which translates into its popularity among students studying in diverse programs. So, if you feel a bit unsure in class, rest assured that there is an easy way to help you grasp the nuances of statistics! Understand statistical ideas, techniques, formulas, and calculations Interpret and critique graphs and charts, determine probability, and work with confidence intervals Critique and analyze data from polls and experiments Combine learning and applying your new knowledge with practical examples, practice problems, and expanded online resources U Can: Statistics I For Dummies contains everything you need to score higher in your fundamental statistics course!

Statistical Methods for Drug Safety

This book presents a wide variety of statistical approaches for analyzing pharmacoepidemiologic data. It covers both commonly used techniques, such as proportional reporting ratios for the analysis of spontaneous adverse event reports, and newer approaches, such as the use of marginal structural models for controlling dynamic selection bias in the analysis of large-scale longitudinal observational data. Many real examples from both mental and physical health disorders illustrate the use of the methods.
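For readers unfamiliar with the proportional reporting ratio (PRR) mentioned above, here is a minimal sketch of its standard definition on a 2×2 table of spontaneous reports: the share of a drug's reports that mention the event, divided by the same share among all other drugs. The function name and the counts are hypothetical and chosen only for illustration; the book's own treatment may differ in notation and in how it handles uncertainty.

```python
def proportional_reporting_ratio(a, b, c, d):
    """PRR from a 2x2 table of spontaneous adverse event reports.

    a: reports of the event of interest for the drug of interest
    b: reports of all other events for the drug of interest
    c: reports of the event of interest for all other drugs
    d: reports of all other events for all other drugs
    """
    if a + b == 0 or c == 0:
        raise ValueError("PRR is undefined when a + b == 0 or c == 0")
    # [a / (a + b)] / [c / (c + d)], rearranged to use a single division.
    return a * (c + d) / (c * (a + b))


# Invented counts: 20 of 100 reports for the drug mention the event,
# versus 100 of 10,000 reports for all other drugs.
prr = proportional_reporting_ratio(a=20, b=80, c=100, d=9900)
print(prr)
```

A PRR well above 1 suggests the event is reported disproportionately often for the drug, but it is a screening signal from reporting data, not an estimate of risk.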

Web Scraping with Python

Learn web scraping and crawling techniques to access unlimited data from any web source in any format. With this practical guide, you’ll learn how to use Python scripts and web APIs to gather and process data from thousands—or even millions—of web pages at once. Ideal for programmers, security professionals, and web administrators familiar with Python, this book not only teaches basic web scraping mechanics, but also delves into more advanced topics, such as analyzing raw data or using scrapers for frontend website testing. Code samples are available to help you understand the concepts in practice.

Leadership and Women in Statistics

This unique and insightful book examines leadership within the roles of statistician and data scientist from international and diverse perspectives. Featuring contributions from leadership experts and statisticians at various stages on the career jungle gym, the text supplies a greater understanding of leadership within teams, research consulting, and project management. It encourages reflection on leadership behaviors, promoting natural and organizational leadership, identifying existing opportunities to foster creative outputs and develop strong leadership voices, and explaining how to convert a passion for statistical science into visionary, ethical, and transformational leadership.

Mastering Matplotlib

Mastering Matplotlib provides readers with the tools to not just create visualizations but to fully harness the capabilities of the Matplotlib library. You will explore advanced features, work on interactive visualizations, and learn to optimize plots for various platforms and datasets. By the end, you will be adept at using Matplotlib in complex projects involving data analysis and visualization. What this Book will help me do Understand the architecture and internals of Matplotlib to better utilize and extend its features. Develop visually dynamic and interactive plots that update in real-time with changes in the user interface. Leverage third-party libraries to visualize complex datasets and relationships efficiently. Create tailored styling for visualizations, meeting publication and presentation standards. Deploy and integrate Matplotlib-based visualizations into cloud environments and big data workflows seamlessly. Author(s) Duncan M. McGreggor is a seasoned software engineer with years of hands-on experience in data visualization and scientific computing. He specializes in utilizing Matplotlib for dynamic charting and advanced plotting use cases. His approach to writing focuses on empowering readers to apply and integrate visualization solutions in real-world scenarios. Who is it for? This book is ideal for scientists, software engineers, programmers, and students who have a foundational understanding of Matplotlib and are looking to take their skills to an advanced level. If you're aiming to leverage Matplotlib to handle intricate datasets or to create sophisticated visual representations, this book is for you. It caters to learners seeking practical guidance for professional or academic projects. Expand your visualization toolkit with this insightful guide.

ggplot2 Essentials

"ggplot2 Essentials" takes you on a journey to mastering data visualization in R. Through this book, you will explore the full capabilities of the ggplot2 package and how it employs the principles of the grammar of graphics to create meaningful and visually appealing graphs. By reading this book, you will gain practical skills to produce stunning plots for your data analysis projects. What this Book will help me do Understand the core concepts of the grammar of graphics and how ggplot2 implements them. Learn to create a variety of plots using the ggplot2's basic and advanced functionalities. Master techniques for customizing plots, including aesthetics and graphical details. Become proficient in exporting plots in diverse formats and creating publication-ready graphs. Incorporate mapping and overlays into your plots, expanding the frontiers of ggplot2 capabilities. Author(s) Donato Teutonico is an experienced data analyst and programmer with a strong background in R and data visualization. With years of expertise, Donato specializes in creating clear, actionable tutorials for users of statistical software. His approach to teaching emphasizes practical, hands-on examples tailored to empower learners with practical plotting skills. Who is it for? This book is ideal for R programmers who want to harness the power of ggplot2 for data visualization. If you are already familiar with R and want to create more sophisticated and customizable graphics, this resource will enhance your skills. It is suited for intermediate R users aiming to discover ggplot2's features and create polished visual representations of their data. Advanced beginners passionate about data visualization will also appreciate the clear explanations and practical examples.

Predicting the Unpredictable

" If you have trouble estimating cost or schedule for your projects, you are not alone. The question is this: who wants the estimate and why? The definition of estimate is to guess. But too often, the people who want estimates want commitments. Instead of a commitment, you can apply practical and pragmatic approaches to developing estimates and then meet your commitments. You can provide your managers with the information they want and that you can live with. Learn how to use different words for your estimates and how to report an estimate that includes uncertainty. Learn who should and should not estimate. Learn how to update your estimate when you know more about your project. Regain estimation sanity. Learn practical and pragmatic ways to estimate schedule or cost for your projects."

Hands-On Mobile App Testing: A Guide for Mobile Testers and Anyone Involved in the Mobile App Business

The First Complete Guide to Mobile App Testing and Quality Assurance: Start-to-Finish Testing Solutions for Both Android and iOS. Today, mobile apps must meet rigorous standards of reliability, usability, security, and performance. However, many mobile developers have limited testing experience, and mobile platforms raise new challenges even for long-time testers. Now, Hands-On Mobile App Testing provides the solution: an end-to-end blueprint for thoroughly testing any iOS or Android mobile app. Reflecting his extensive real-life experience, Daniel Knott offers practical guidance on everything from mobile test planning to automation. He provides expert insights on mobile-centric issues, such as testing sensor inputs, battery usage, and hybrid apps, as well as advice on coping with device and platform fragmentation, and more. If you want top-quality apps as much as your users do, this guide will help you deliver them. You’ll find it invaluable, whether you’re part of a large development team or you are the team. Learn how to Establish your optimal mobile test and launch strategy Create tests that reflect your customers, data networks, devices, and business models Choose and implement the best Android and iOS testing tools Automate testing while ensuring comprehensive coverage Master both functional and nonfunctional approaches to testing Address mobile’s rapid release cycles Test on emulators, simulators, and actual devices Test native, hybrid, and Web mobile apps Gain value from crowd and cloud testing (and understand their limitations) Test database access and local storage Drive value from testing throughout your app lifecycle Start testing wearables, connected homes/cars, and Internet of Things devices