O'Reilly Data Science Books

Data Science with Python and Dask

2019-07-18 O'Reilly Amazon

book

Jesse Daniel

data data-science data-science-tools dask AI/ML Analytics

Dask is a native parallel analytics tool designed to integrate seamlessly with the libraries you’re already using, including Pandas, NumPy, and Scikit-learn. With Dask you can crunch and work with huge datasets, using the tools you already have. And Data Science with Python and Dask is your guide to using Dask for your data projects without changing the way you work!
About the Technology: An efficient data pipeline means everything for the success of a data science project. Dask is a flexible library for parallel computing in Python that makes it easy to build intuitive workflows for ingesting and analyzing large, distributed datasets. Dask provides dynamic task scheduling and parallel collections that extend the functionality of NumPy, Pandas, and Scikit-learn, enabling users to scale their code from a single laptop to a cluster of hundreds of machines with ease.
About the Book: Data Science with Python and Dask teaches you to build scalable projects that can handle massive datasets. After meeting the Dask framework, you’ll analyze data in the NYC Parking Ticket database and use DataFrames to streamline your process. Then, you’ll create machine learning models using Dask-ML, build interactive visualizations, and build clusters using AWS and Docker.
What's Inside:
Working with large, structured and unstructured datasets
Visualization with Seaborn and Datashader
Implementing your own algorithms
Building distributed apps with Dask Distributed
Packaging and deploying Dask apps
About the Reader: For data scientists and developers with experience using Python and the PyData stack.
About the Author: Jesse Daniel is an experienced Python developer. He taught Python for Data Science at the University of Denver and leads a team of data scientists at a Denver-based media technology company.
Quotes:
"The most comprehensive coverage of Dask to date, with real-world examples that made a difference in my daily work." - Al Krinker, United States Patent and Trademark Office
"An excellent alternative to PySpark for those who are not on a cloud platform. The author introduces Dask in a way that speaks directly to an analyst." - Jeremy Loscheider, Panera Bread
"A greatly paced introduction to Dask with real-world datasets." - George Thomas, R&D Architecture Manhattan Associates
"The ultimate resource to quickly get up and running with Dask and parallel processing in Python." - Gustavo Patino, Oakland University William Beaumont School of Medicine

Data Science Strategy For Dummies

2019-07-11 O'Reilly Amazon

book

Ulrika Jägare

data data-science Analytics Big Data Data Science

All the answers to your data science questions
Over half of all businesses are using data science to generate insights and value from big data. How are they doing it? Data Science Strategy For Dummies answers all your questions about how to build a data science capability from scratch, starting with the “what” and the “why” of data science and covering what it takes to lead and nurture a top-notch team of data scientists. With this book, you’ll learn how to incorporate data science as a strategic function into any business, large or small. Find solutions to your real-life challenges as you uncover the stories and value hidden within data.
Learn exactly what data science is and why it’s important
Adopt a data-driven mindset as the foundation to success
Understand the processes and common roadblocks behind data science
Keep your data science program focused on generating business value
Nurture a top-quality data science team
In non-technical language, Data Science Strategy For Dummies outlines new perspectives and strategies to effectively lead analytics and data science functions to create real value.

The Care and Feeding of Data Scientists

2019-06-25 O'Reilly Amazon

book

Michelangelo D'Agostino, Katie Malone

data data-science data-science-as-a-profession Agile/Scrum Analytics Data Science

As a discipline, data science is relatively young, but the job of managing data scientists is younger still. Many people undertake this management position without the tools, mentorship, or role models they need to do it well. This report examines the steps necessary to build, manage, sustain, and retain a growing data science team. You’ll learn how data science management is similar to but distinct from other management types. Michelangelo D’Agostino, VP of Data Science and Engineering at ShopRunner, and Katie Malone, Director of Data Science at Civis Analytics, provide concrete tips for balancing and structuring a data science team, recruiting and interviewing the best candidates, and keeping them productive and happy once they're in place.
In this report, you'll:
Explore data scientist archetypes, such as operations and research, that fit your organization
Devise a plan to recruit, interview, and hire members for your data science team
Retain your hires by providing challenging work and learning opportunities
Explore Agile and OKR methodology to determine how your team will work together
Provide your team with a career ladder through guidance and mentorship

Principles of Strategic Data Science

2019-06-03 O'Reilly Amazon

book

Peter Prevos

data data-science Data Science Python

"Principles of Strategic Data Science" is your go-to guide for creating measurable value from data through strategic use of tools and techniques. This book takes you through key theoretical foundations, practical tools, and the managerial perspective necessary to succeed in data science. What this Book will help me do Master the five-phase framework for strategic data science. Learn ways to effectively visualize data information. Explore the role and contributions of a data science manager. Gain clear insights into organizational benefits of data science. Understand the ethical and mathematical boundaries of data analysis. Author(s) Peter Prevos is an accomplished engineer and social scientist with extensive expertise in data science applications. He combines technical insights with social science management practices to design effective data strategies. Known for his clear teaching style, Peter helps professionals integrate theory with practical planning. Who is it for? This book is ideal for data scientists and analysts seeking to deepen their strategic understanding of data science. It's well-suited for intermediate professionals looking to gain insights into data-driven decision making. Readers should have basic programming knowledge in Python or R. Novice managers eager to harness data for organizational goals will also find it valuable.

Applied Supervised Learning with R

2019-05-31 O'Reilly Amazon

book

Jojo Moolayil, Karthik Ramasubramanian

data data-science data-science-tools r AI/ML Analytics

Applied Supervised Learning with R equips you with the essential knowledge and practical skills to leverage machine learning techniques for solving business problems using R. With this book, you'll gain hands-on experience in implementing various supervised learning models, assessing their performance, and selecting the best-suited method for your objectives.
What this Book will help me do:
Gain expertise in identifying and framing business problems suitable for supervised learning.
Acquire skills in data wrangling and visualization using R packages like dplyr and ggplot2.
Master techniques for tuning hyperparameters to optimize machine learning models.
Understand methods for feature selection and dimensionality reduction to enhance model performance.
Learn how to deploy machine learning models to production environments, such as AWS Lambda.
Author(s): Karthik Ramasubramanian and Jojo Moolayil are both seasoned data science practitioners and educators who bring a wealth of experience in machine learning and analytics. With a deep understanding of R and its applications in real-world scenarios, they offer practical insights and actionable examples to their readers. Their teaching style focuses on clarity and practical application.
Who is it for? This book is ideal for data analysts, data scientists, and data engineers at a beginner to intermediate level who aim to master supervised machine learning with R. Readers should have basic knowledge of statistics, probabilities, and R programming. It is designed for those eager to apply machine learning techniques to real-world problems and improve their decision-making capabilities.

Intro to Python for Computer Science and Data Science: Learning to Program with AI, Big Data and The Cloud

2019-05-10 O'Reilly Amazon

book

Paul J. Deitel, Harvey M. Deitel

software-development programming-languages Python AI/ML Big Data Cloud Computing

This is the eBook of the printed book and may not include any media, website access codes, or print supplements that may come packaged with the bound book.
For introductory-level Python programming and/or data-science courses.
A groundbreaking, flexible approach to computer science and data science
The Deitels’ Introduction to Python for Computer Science and Data Science: Learning to Program with AI, Big Data and the Cloud offers a unique approach to teaching introductory Python programming, appropriate for both computer-science and data-science audiences. Providing the most current coverage of topics and applications, the book is paired with extensive traditional supplements as well as Jupyter Notebooks supplements. Real-world datasets and artificial-intelligence technologies allow students to work on projects making a difference in business, industry, government and academia. Hundreds of examples, exercises, projects (EEPs), and implementation case studies give students an engaging, challenging and entertaining introduction to Python programming and hands-on data science.
Related Content:
Video: Python Fundamentals
Live course: Python Full Throttle with Paul Deitel: A One-Day, Fast-Paced, Code-Intensive Python Presentation
Live course: Python® Data Science Full Throttle with Paul Deitel: Introductory Artificial Intelligence (AI), Big Data and Cloud Case Studies
The book’s modular architecture enables instructors to conveniently adapt the text to a wide range of computer-science and data-science courses offered to audiences drawn from many majors. Computer-science instructors can integrate as much or as little data-science and artificial-intelligence topics as they’d like, and data-science instructors can integrate as much or as little Python as they’d like. The book aligns with the latest ACM/IEEE CS-and-related computing curriculum initiatives and with the Data Science Undergraduate Curriculum Proposal sponsored by the National Science Foundation.

Machine Learning in Production: Developing and Optimizing Data Science Workflows and Applications

2019-05-08 O'Reilly Amazon

book

Andrew Kelleher, Adam Kelleher

data data-science AI/ML Data Science

The typical data science task in industry starts with an “ask” from the business. But few data scientists have been taught what to do with that ask. This book shows them how to assess it in the context of the business’s goals, reframe it to work optimally for both the data scientist and the employer, and then execute on it. Written by two of the experts who’ve achieved breakthrough optimizations at BuzzFeed, it’s packed with real-world examples that take you from start to finish: from ask to actionable insight. Andrew Kelleher and Adam Kelleher walk you through well-formed, concrete principles for approaching common data science problems, giving you an easy-to-use checklist for effective execution. Using their principles and techniques, you’ll gain deeper understanding of your data, learn how to analyze noise and confounding variables so they don’t compromise your analysis, and save weeks of iterative improvement by planning your projects more effectively upfront. Once you’ve mastered their principles, you’ll put them to work in two realistic, beginning-to-end site optimization tasks. These extended examples come complete with reusable code examples and recommended open-source solutions designed for easy adaptation to your everyday challenges. They will be especially valuable for anyone seeking their first data science job – and everyone who’s found that job and wants to succeed in it.

D3 for the Impatient

2019-05-02 O'Reilly Amazon

book

Philipp K. Janert

data data-science data-science-tasks data-visualization d3 Data Science

If you’re in a hurry to learn D3.js, the leading JavaScript library for web-based graphics and visualization, this book is for you. Written for technically savvy readers with a background in programming or data science, the book moves quickly, emphasizing unifying concepts and patterns. Anticipating common difficulties, author Philipp K. Janert teaches you how to apply D3 to your own problems. Assuming only a general programming background, but no previous experience with contemporary web development, this book explains supporting technologies such as SVG, HTML5, CSS, and the DOM as needed, making it a convenient one-stop resource for a technical audience.
Understand D3 selections, the library’s fundamental organizing principle
Learn how to create data-driven documents with data binding
Create animated graphs and interactive user interfaces
Draw figures with curves, shapes, and colors
Use the built-in facilities for heatmaps, tree graphs, and networks
Simplify your work by writing your own reusable components

Data Science Projects with Python

2019-04-30 O'Reilly Amazon

book

Stephen Klosterman

software-development programming-languages Python AI/ML Data Science Matplotlib

Data Science Projects with Python introduces you to data science and machine learning using Python through practical examples. In this book, you'll learn to analyze, visualize, and model data, applying techniques like logistic regression and random forests. With a case-study method, you'll build confidence implementing insights in real-world scenarios.
What this Book will help me do:
Set up a data science environment with necessary Python libraries such as pandas and scikit-learn.
Effectively visualize data insights through Matplotlib and summary statistics.
Apply machine learning models including logistic regression and random forests to solve data problems.
Identify optimal models through evaluation metrics like k-fold cross-validation.
Develop confidence in data preparation and modeling techniques for real-world data challenges.
Author(s): Stephen Klosterman is a seasoned data scientist with a keen interest in practical applications of machine learning. He combines a strong academic foundation with real-world experience to craft relatable content. Stephen excels in breaking down complex topics into approachable lessons, helping learners grow their data science expertise step by step.
Who is it for? This book is ideal for data analysts, scientists, and business professionals looking to enhance their skills in Python and data science. If you have some experience in Python and a foundational understanding of algebra and statistics, you'll find this book approachable. It offers an excellent gateway to mastering advanced data analysis techniques. Whether you're seeking to explore machine learning or apply data insights, this book supports your growth.

Learn RStudio IDE: Quick, Effective, and Productive Data Science

2019-04-17 O'Reilly Amazon

book

Matthew Campbell

data data-science CSV Data Science DataViz Git

Discover how to use the popular RStudio IDE as a professional tool that includes code refactoring support, debugging, and Git version control integration. This book gives you a tour of RStudio and shows you how it helps you do exploratory data analysis; build data visualizations with ggplot; and create custom R packages and web-based interactive visualizations with Shiny. In addition, you will cover common data analysis tasks including importing data from diverse sources such as SAS files, CSV files, and JSON. You will map out the features in RStudio so that you will be able to customize RStudio to fit your own style of coding. Finally, you will see how to save a ton of time by adopting best practices and using packages to extend RStudio. Learn RStudio IDE is a quick, no-nonsense tutorial of RStudio that will give you a head start to develop the insights you need in your data science projects.
What You Will Learn:
Quickly, effectively, and productively use RStudio IDE for building data science applications
Install RStudio and program your first Hello World application
Adopt the RStudio workflow
Make your code reusable using RStudio
Use RStudio and Shiny for data visualization projects
Debug your code with RStudio
Import CSV, SPSS, SAS, JSON, and other data
Who This Book Is For: Programmers who want to start doing data science, but don’t know what tools to focus on to get up to speed quickly.

Data Science for Business and Decision Making

2019-04-11 O'Reilly Amazon

book

Patricia Belfiore, Luiz Paulo Favero

data data-science Analytics Data Science IBM SPSS

Data Science for Business and Decision Making covers both statistics and operations research while most competing textbooks focus on one or the other. As a result, the book more clearly defines the principles of business analytics for those who want to apply quantitative methods in their work. Its emphasis reflects the importance of regression, optimization and simulation for practitioners of business analytics. Each chapter uses a didactic format that is followed by exercises and answers. Freely accessible datasets enable students and professionals to work with Excel, Stata Statistical Software®, and IBM SPSS Statistics Software®.
Combines statistics and operations research modeling to teach the principles of business analytics
Written for students who want to apply statistics, optimization and multivariate modeling to gain competitive advantages in business
Shows how powerful software packages, such as SPSS and Stata, can create graphical and numerical outputs

Data Science Using Python and R

2019-04-09 O'Reilly Amazon

book

Chantal D. Larose, Daniel T. Larose

data data-science Analytics Data Science Python

Learn data science by doing data science! Data Science Using Python and R will get you plugged into the world’s two most widespread open-source platforms for data science: Python and R. Data science is hot. Bloomberg called data scientist “the hottest job in America.” Python and R are the top two open-source data science tools in the world. In Data Science Using Python and R, you will learn step-by-step how to produce hands-on solutions to real-world business problems, using state-of-the-art techniques. Data Science Using Python and R is written for the general reader with no previous analytics or programming experience. An entire chapter is dedicated to learning the basics of Python and R. Then, each chapter presents step-by-step instructions and walkthroughs for solving data science problems using Python and R. Those with analytics experience will appreciate having a one-stop shop for learning how to do data science using Python and R. Topics covered include data preparation, exploratory data analysis, preparing to model the data, decision trees, model evaluation, misclassification costs, naïve Bayes classification, neural networks, clustering, regression modeling, dimension reduction, and association rules mining. Further, exciting new topics such as random forests and general linear models are also included. The book emphasizes data-driven error costs to enhance profitability, which avoids the common pitfalls that may cost a company millions of dollars. Data Science Using Python and R provides exercises at the end of every chapter, totaling over 500 exercises in the book. Readers will therefore have plenty of opportunity to test their newfound data science skills and expertise. In the Hands-on Analysis exercises, readers are challenged to solve interesting business problems using real-world data sets.

Data Science for Marketing Analytics

2019-03-30 O'Reilly Amazon

book

Pranshu Bhatnagar, Tommy Blanchard, Debasish Behera

data data-science AI/ML Analytics Data Science Marketing

Data Science for Marketing Analytics introduces you to leveraging state-of-the-art data science techniques to optimize marketing outcomes. You'll learn how to manipulate and analyze data using Python, create customer segments, and apply machine learning algorithms to predict customer behavior. This book provides a comprehensive, hands-on approach to marketing analytics.
What this Book will help me do:
Learn to use Python libraries like pandas and Matplotlib for data analysis.
Understand clustering techniques to create meaningful customer segments.
Implement linear regression for predicting customer lifetime value.
Explore classification algorithms to model customer preferences.
Develop skills to build interactive dashboards for marketing reports.
Author(s): Tommy Blanchard, Debasish Behera, and Pranshu Bhatnagar are experienced professionals in data science and marketing analytics, with extensive backgrounds in applying machine learning to real-world business applications. They bring a wealth of knowledge and an approachable teaching style to this book, focusing on practical, industry-relevant applications for learners.
Who is it for? This book is for developers and marketing professionals looking to advance their analytics skills. It is ideal for individuals with a basic understanding of Python and mathematics who want to explore predictive modeling and segmentation strategies. Readers should have a curiosity for data-driven problem-solving in marketing contexts to benefit most from the content.

Hands-On Data Science for Marketing

2019-03-29 O'Reilly Amazon

book

Yoon Hyup Hwang

data data-science AI/ML Analytics Data Science KPI

The book "Hands-On Data Science for Marketing" equips readers with the tools and insights to optimize their marketing campaigns using data science and machine learning techniques. Using practical examples in Python and R, you will learn how to analyze data, predict customer behavior, and implement effective strategies for better customer engagement and retention. What this Book will help me do Understand marketing KPIs and learn to compute and visualize them in Python and R. Develop the ability to analyze customer behavior and predict potential high-value customers. Master machine learning concepts for customer segmentation and personalized marketing strategies. Improve your skills to forecast customer engagement and lifetime value for more effective planning. Learn the techniques of A/B testing and their application in refining marketing decisions. Author(s) Yoon Hyup Hwang is a seasoned data scientist with a deep interest in the intersection of marketing and technology. With years of expertise in implementing machine learning algorithms in marketing analytics, Yoon brings a unique perspective by blending technical insights with business strategy. As an educator and practitioner, Yoon's approachable style and clear explanations make complex topics accessible for all learners. Who is it for? This book is tailored for marketing professionals looking to enhance their strategies using data science, data enthusiasts eager to apply their skills in marketing, and students or engineers seeking to expand their knowledge in this domain. A basic understanding of Python or R is beneficial, but the book is structured to welcome beginners by covering foundational to advanced concepts in a practical way.

Machine Learning with R Quick Start Guide

2019-03-29 O'Reilly Amazon

book

Iván Pastor Sanz

data data-science data-science-tools r AI/ML Analytics

Machine Learning with R Quick Start Guide takes you through the foundations of machine learning using the R programming language. Starting with the basics, this book introduces key algorithms and methodologies, offering hands-on examples and applicable machine learning solutions that allow you to extract insights and create predictive models.
What this Book will help me do:
Understand the basics of machine learning and apply them using R 3.5.
Learn to clean, prepare, and visualize data with R to ensure robust data analysis.
Develop and work with predictive models using various machine learning techniques.
Discover advanced topics like Natural Language Processing and neural network training.
Implement end-to-end pipeline solutions, from data collection to predictive analytics, in R.
Author(s): Iván Pastor Sanz, the author of Machine Learning with R Quick Start Guide, is an expert in data science with years of experience in the field of machine learning and R programming. Known for an accessible and detailed teaching style, the author focuses on providing practical knowledge to empower readers in the real world.
Who is it for? This book is ideal for graduate students and professionals, including aspiring data scientists and data analysts, looking to start their journey in machine learning. Readers are expected to have some familiarity with the R programming language but no prior machine learning experience is necessary. With this book, the audience will gain the ability to confidently navigate machine learning concepts and practices.

Meta-Analytics

2019-03-10 O'Reilly Amazon

book

Steven Simske

data data-science data-science-tasks exploratory-data-analysis AI/ML Analytics

Meta-Analytics: Consensus Approaches and System Patterns for Data Analysis presents an exhaustive set of patterns for data science to use on any machine learning-based data analysis task. The book virtually ensures that at least one pattern will lead to better overall system behavior than the use of traditional analytics approaches. The book is ‘meta’ to analytics, covering general analytics in sufficient detail for readers to engage with, and understand, hybrid or meta-approaches. The book has relevance to machine translation, robotics, biological and social sciences, medical and healthcare informatics, economics, business and finance. In addition, the analytics within can be applied to predictive algorithms for everyone from police departments to sports analysts.
Provides comprehensive and systematic coverage of machine learning-based data analysis tasks
Enables rapid progress towards competency in data analysis techniques
Gives exhaustive and widely applicable patterns for use by data scientists
Covers hybrid or ‘meta’ approaches, along with general analytics
Lays out information and practical guidance on data analysis for practitioners working across all sectors

Python for Data Science For Dummies, 2nd Edition

2019-02-27 O'Reilly Amazon

book

John Paul Mueller, Luca Massaron

data data-science Cloud Computing Data Science Python

The fast and easy way to learn Python programming and statistics
Python is a general-purpose programming language, created in the late 1980s and named after Monty Python, that's used by thousands of people to do things from testing microchips at Intel, to powering Instagram, to building video games with the PyGame library. Python For Data Science For Dummies is written for people who are new to data analysis, and discusses the basics of Python data analysis programming and statistics. The book also discusses Google Colab, which makes it possible to write Python code in the cloud.
Get started with data science and Python
Visualize information
Wrangle data
Learn from data
The book provides the statistical background needed to get started in data science programming, including probability, random distributions, hypothesis testing, confidence intervals, and building regression models for prediction.

Hands-On Data Science with the Command Line

2019-01-31 O'Reilly Amazon

book

Chris McCubbin, Raymond Page, Jason Morris

data data-science Bash Data Science

"Hands-On Data Science with the Command Line" introduces the incredible power of command-line tools to simplify and automate data science tasks. Leveraging tools like AWK, Bash, and more, you'll learn not only to handle datasets effectively but also to create efficient data pipelines and visualize data directly from the command line. What this Book will help me do Learn to set up and optimize the command line interface for data science tasks. Master using AWK and similar tools for data processing. Discover strategies for scripting, automation, and managing files efficiently. Understand how to visualize data directly from the command line. Gain fluency in combining tools to create seamless data pipelines. Author(s) The authors, None Morris, None McCubbin, and None Page, are experienced data scientists and technical authors with a passion for teaching complex topics in approachable ways. Their extensive experience using command-line tools for data-related workflows equips them to guide readers step-by-step in mastering these powerful techniques. Who is it for? This book is ideal for data scientists and data analysts seeking to streamline and automate their workflows using command-line tools. If you have basic experience with data science and are curious about incorporating the efficiency of the command line into your work, this guide is perfect for you.

Beyond Spreadsheets with R

2019-01-17 O'Reilly Amazon

book

Jonathan Carroll

data data-science data-science-tools r Data Science

Beyond Spreadsheets with R shows you how to take raw data and transform it for use in computations, tables, graphs, and more. You’ll build on simple programming techniques like loops and conditionals to create your own custom functions. You’ll come away with a toolkit of strategies for analyzing and visualizing data of all sorts using R and RStudio.
About the Technology: Spreadsheets are powerful tools for many tasks, but if you need to interpret, interrogate, and present data, they can feel like the wrong tools for the task. That’s when R programming is the way to go. The R programming language provides a comfortable environment to properly handle all types of data. And within the open source RStudio development suite, you have at your fingertips easy-to-use ways to simplify complex manipulations and create reproducible processes for analysis and reporting.
About the Book: With Beyond Spreadsheets with R you’ll learn how to go from raw data to meaningful insights using R and RStudio. Each carefully crafted chapter covers a unique way to wrangle data, from understanding individual values to interacting with complex collections of data, including data you scrape from the web. You’ll build on simple programming techniques like loops and conditionals to create your own custom functions. You’ll come away with a toolkit of strategies for analyzing and visualizing data of all sorts.
What's Inside:
How to start programming with R and RStudio
Understanding and implementing important R structures and operators
Installing and working with R packages
Tidying, refining, and plotting your data
About the Reader: If you’re comfortable writing formulas in Excel, you’re ready for this book.
About the Author: Dr Jonathan Carroll is a data science consultant providing R programming services. He holds a PhD in theoretical physics.
Quotes:
"A useful guide to facilitate graduating from spreadsheets to more serious data wrangling with R." - John D. Lewis, DDN
"An excellent book to help you understand how stored data can be used." - Hilde Van Gysel, Trebol Engineering
"A great introduction to a data science programming language. Makes you want to learn more!" - Jenice Tom, CVS Health
"Handy to have when your data spreads beyond a spreadsheet." - Danil Mironov, Luxoft Poland

Principles of Data Science - Second Edition

2018-12-26 O'Reilly Amazon

book

Sunil Kakade , Sinan Ozdemir , Marco Tibaldeschi

data data-science AI/ML Analytics Data Science Python

Dive into the intricacies of data science with 'Principles of Data Science'. This book takes you on a journey to explore, analyze, and transform data into actionable insights using mathematical models, Python programming, and machine learning concepts. With a clear and engaging style, you will progress from understanding theoretical foundations to implementing advanced techniques in real-world scenarios. What this Book will help me do: Master the five critical steps in a practical data science workflow. Clean and prepare raw datasets for accurate machine learning models. Understand and apply statistical models and mathematical principles for data analysis. Build and evaluate predictive models using Python and effective metrics. Create impactful visualizations that clearly convey data insights. Author(s): Sinan Ozdemir is an expert in data science, with a background in developing and teaching advanced courses in machine learning and predictive analytics. Together with co-authors Sunil Kakade and Marco Tibaldeschi, he brings years of hands-on experience in data science to this comprehensive guide. Their approach simplifies complex concepts, making them accessible without sacrificing depth, to empower readers to make data-driven decisions confidently. Who is it for? This book is ideal for aspiring data scientists seeking a practical introduction to the field. It's perfect for those with basic math skills looking to apply them to data science or experienced programmers who want to explore the mathematical foundation of data science. A basic understanding of Python programming will be invaluable, but the book builds up core concepts step-by-step, making it accessible to both beginners and experienced professionals.

Numerical Python: Scientific Computing and Data Science Applications with Numpy, SciPy and Matplotlib

2018-12-24 O'Reilly Amazon

book

Robert Johansson

data data-science data-science-tools NumPy AI/ML Big Data

Leverage the numerical and mathematical modules in Python and its standard library as well as popular open source numerical Python packages like NumPy, SciPy, FiPy, matplotlib and more. This fully revised edition, updated with the latest details of each package and changes to Jupyter projects, demonstrates how to numerically compute solutions and mathematically model applications in big data, cloud computing, financial engineering, business management and more. Numerical Python, Second Edition, presents many brand-new case study examples of applications in data science and statistics using Python, along with extensions to many previous examples. Each of these demonstrates the power of Python for rapid development and exploratory computing due to its simple and high-level syntax and multiple options for data analysis. After reading this book, readers will be familiar with many computing techniques including array-based and symbolic computing, visualization and numerical file I/O, equation solving, optimization, interpolation and integration, and domain-specific computational problems, such as differential equation solving, data analysis, statistical modeling and machine learning. What You'll Learn: Work with vectors and matrices using NumPy; plot and visualize data with Matplotlib; perform data analysis tasks with Pandas and SciPy; review statistical modeling and machine learning with statsmodels and scikit-learn; optimize Python code using Numba and Cython. Who This Book Is For: Developers who want to understand how to use Python and its related ecosystem for numerical computing.

2017 Data Science Salary Survey

2018-12-12 O'Reilly Amazon

book

Brian Suda

data data-science data-science-as-a-profession Cloud Computing Data Science

Get a clear picture of the salaries and bonuses data science professionals around the world receive, as well as the tools and cloud providers they use, the tasks they perform, and how interpersonal ("soft") skills might affect their pay. The fifth edition of O’Reilly’s online Data Science Salary Survey provides complete results from nearly 800 participants from 69 different countries, 42 different US states, and Washington, DC. With five years of data, the survey’s results are consistent enough to reliably identify changes and trends. The survey asked specific questions about industry, team, and company size, but also posed questions such as, "How easy is it to move to another position?" or "What is your next career step?" You can plug in your own data points to the survey model and see how you compare to other data science professionals in your industry. With this report, you’ll learn: where data scientists make the highest salaries, by country and by US state; tools that respondents most commonly use on the job, and tools that contribute most to salary; activities that contribute to higher earnings; how gender and bargaining skills affect salaries when all other factors are equal; salary differences between those using open source tools vs those using proprietary tools; how the increase in respondents outside of the US signals a rise in international companies starting and growing data organizations. Participate in the 2018 Survey: Spend just 5 to 10 minutes and take the anonymous salary survey here: https://www.oreilly.com/ideas/take-the-data-science-salary-survey.

SAS for Mixed Models

2018-12-12 O'Reilly Amazon

book

Russell D. Wolfinger , George A. Milliken , Elizabeth A. Claassen , Walter W. Stroup

data data-science analytics-platforms SAS Data Science

Discover the power of mixed models with SAS. Mixed models, now the mainstream vehicle for analyzing most research data, are part of the core curriculum in most master’s degree programs in statistics and data science. In a single volume, this book updates both SAS® for Linear Models, Fourth Edition, and SAS® for Mixed Models, Second Edition, covering the latest capabilities for a variety of applications featuring the SAS GLIMMIX and MIXED procedures. Written for instructors of statistics, graduate students, scientists, statisticians in business or government, and other decision makers, SAS® for Mixed Models is the perfect entry for those with a background in two-way analysis of variance, regression, and intermediate-level use of SAS. This book expands coverage of mixed models for non-normal data and mixed-model-based precision and power analysis, including the following topics: random-effect-only and random-coefficients models; multilevel, split-plot, multilocation, and repeated measures models; hierarchical models with nested random effects; analysis of covariance models; generalized linear mixed models. This book is part of the SAS Press program.

Bioinformatics with Python Cookbook - Second Edition

2018-11-30 O'Reilly Amazon

book

Tiago Antao

data data-science data-science-domains bioinformatics AI/ML Data Science

"Bioinformatics with Python Cookbook" offers a detailed exploration into the modern approaches to computational biology using the Python programming language. Through hands-on recipes, you will master the practical applications of bioinformatics, enabling you to analyze vast biological data effectively using Python libraries and tools. What this Book will help me do Master processing and analyzing genomic datasets in Python to enable accurate bioinformatics discoveries. Understand and apply next-generation sequencing techniques for advanced biological research. Learn to utilize machine learning approaches such as PCA and decision trees for insightful data analysis in biology. Gain proficiency in using high-performance computing frameworks like Dask and Spark for scalable bioinformatics workflows. Develop capabilities to visually represent biological data interactions and insights for presentation and analysis. Author(s) Tiago Antao is a computational scientist specializing in bioinformatics with extensive experience in Python programming applied to biological sciences. He has worked on numerous bioinformatics projects and has a special interest in using Python to bridge biology and data science. Tiago's approachable writing style ensures that both newcomers and experts benefit from his insights. Who is it for? This book is designed for bioinformatics professionals, researchers, and data scientists who are eager to harness the power of Python programming for their biological data analysis needs. If you are familiar with Python and are looking to tackle intermediate to advanced bioinformatics challenges using practical recipes, this book is ideal for you. It is suitable for those seeking to expand their knowledge in computational biology and data visualization techniques. Whether you are working on next-generation sequencing or population genetics, this resource will guide you effectively.

Hands-On Data Science with R

2018-11-30 O'Reilly Amazon

book

Vitor Bianchi Lanzetta , Nataraj Dasgupta , Ricardo Anjoleto Farias , Doug Ortiz

data data-science AI/ML Analytics Big Data Data Science

Dive into "Hands-On Data Science with R" and embark on a journey to master the R language for practical data science applications. This comprehensive guide walks through data manipulation, visualization, and advanced analytics, preparing you to tackle real-world data challenges with confidence. What this Book will help me do: Understand how to utilize popular R packages effectively for data science tasks. Learn techniques for cleaning, preprocessing, and exploring datasets. Gain insights into implementing machine learning models in R for predictive analytics. Master the use of advanced visualization tools to extract and communicate insights. Develop expertise in integrating R with big data platforms like Hadoop and Spark. Author(s): This book was written by Vitor Bianchi Lanzetta, Nataraj Dasgupta, Ricardo Anjoleto Farias, and Doug Ortiz, experts in data science and R. They bring years of industry experience and a desire to teach, presenting complex topics in an approachable manner. Who is it for? This book is designed for data analysts, statisticians, or programmers with basic R knowledge looking to dive into machine learning and predictive analytics. If you're aiming to enhance your skill set or gain confidence in tackling real-world data problems, this book is an excellent choice.
