talk-data.com

Event

O'Reilly Data Science Books

2013-08-09 – 2026-02-25 · O'Reilly

Activities tracked

333

Collection of O'Reilly books on Data Science.

Filtering by: AI/ML

Sessions & talks

Showing 226–250 of 333 · Newest first

Learn Python by Building Data Science Applications

Learn Python by Building Data Science Applications takes a hands-on approach to teaching Python programming by guiding you through building engaging real-world data science projects. This book introduces Python's rich ecosystem and equips you with the skills to analyze data, train models, and deploy them as efficient applications.
What this Book will help me do
• Get proficient in Python programming by learning core topics like data structures, loops, and functions.
• Explore data science libraries such as NumPy, Pandas, and scikit-learn to analyze and process data.
• Learn to create visualizations with Matplotlib and Altair, simplifying data communication.
• Build and deploy machine learning models using Python and share them as web services.
• Understand development practices such as testing, packaging, and continuous integration for professional workflows.
Author(s)
Kats and Katz are seasoned Python developers with years of experience in teaching programming and deploying data science applications. Their expertise spans providing learners with practical knowledge and versatile skills. They combine clear explanations with engaging projects to ensure a rewarding learning experience.
Who is it for?
This book is ideal for individuals new to programming or data science who want to learn Python through practical projects. Researchers, analysts, and ambitious students with minimal coding background but a keen interest in data analysis and application development will find this book beneficial. It's a perfect choice for anyone eager to explore and leverage Python for real-world solutions.

Business Data Science: Combining Machine Learning and Economics to Optimize, Automate, and Accelerate Business Decisions

Publisher's Note: Products purchased from Third Party sellers are not guaranteed by the publisher for quality, authenticity, or access to any online entitlements included with the product.
Use machine learning to understand your customers, frame decisions, and drive value.
The business analytics world has changed, and Data Scientists are taking over. Business Data Science takes you through the steps of using machine learning to implement best-in-class business data science. Whether you are a business leader with a desire to go deep on data, or an engineer who wants to learn how to apply Machine Learning to business problems, you’ll find the information, insight, and tools you need to flourish in today’s data-driven economy.
You’ll learn how to:
• Use the key building blocks of Machine Learning: sparse regularization, out-of-sample validation, and latent factor and topic modeling
• Understand how to use ML tools in real-world business problems, where causation matters more than correlation
• Solve data science problems by scripting in the R programming language
Today’s business landscape is driven by data and constantly shifting. Companies live and die on their ability to make and implement the right decisions quickly and effectively. Business Data Science is about doing data science right. It’s about the exciting things being done around Big Data to run a flourishing business. It’s about the precepts, principles, and best practices that you need to know for best-in-class business data science.

Hands-On Data Analysis with Pandas

Hands-On Data Analysis with Pandas provides an intensive dive into mastering the pandas library for data science and analysis using Python. Through a combination of conceptual explanations and practical demonstrations, readers will learn how to manipulate, visualize, and analyze data efficiently.
What this Book will help me do
• Understand and apply the pandas library for efficient data manipulation.
• Learn to perform data wrangling tasks such as cleaning and reshaping datasets.
• Create effective visualizations using pandas and libraries like matplotlib and seaborn.
• Grasp the basics of machine learning and implement solutions with scikit-learn.
• Develop reusable data analysis scripts and modules in Python.
Author(s)
Stefanie Molin is a seasoned data scientist and software engineer with extensive experience in Python and data analytics. She specializes in leveraging the latest data science techniques to solve real-world problems. Her engaging and detailed writing draws from her practical expertise, aiming to make complex concepts accessible to all.
Who is it for?
This book is ideal for data analysts and aspiring data scientists who are at the beginning stages of their careers or looking to enhance their toolset with pandas and Python. It caters to Python developers eager to delve into data analysis workflows. Readers should have some programming knowledge to fully benefit from the examples and exercises.

Data Science with Python and Dask

Dask is a native parallel analytics tool designed to integrate seamlessly with the libraries you’re already using, including Pandas, NumPy, and Scikit-Learn. With Dask you can crunch and work with huge datasets, using the tools you already have. And Data Science with Python and Dask is your guide to using Dask for your data projects without changing the way you work!
About the Technology
An efficient data pipeline means everything for the success of a data science project. Dask is a flexible library for parallel computing in Python that makes it easy to build intuitive workflows for ingesting and analyzing large, distributed datasets. Dask provides dynamic task scheduling and parallel collections that extend the functionality of NumPy, Pandas, and Scikit-learn, enabling users to scale their code from a single laptop to a cluster of hundreds of machines with ease.
About the Book
Data Science with Python and Dask teaches you to build scalable projects that can handle massive datasets. After meeting the Dask framework, you’ll analyze data in the NYC Parking Ticket database and use DataFrames to streamline your process. Then, you’ll create machine learning models using Dask-ML, build interactive visualizations, and build clusters using AWS and Docker.
What's Inside
• Working with large, structured and unstructured datasets
• Visualization with Seaborn and Datashader
• Implementing your own algorithms
• Building distributed apps with Dask Distributed
• Packaging and deploying Dask apps
About the Reader
For data scientists and developers with experience using Python and the PyData stack.
About the Author
Jesse Daniel is an experienced Python developer. He taught Python for Data Science at the University of Denver and leads a team of data scientists at a Denver-based media technology company.
Quotes
The most comprehensive coverage of Dask to date, with real-world examples that made a difference in my daily work. - Al Krinker, United States Patent and Trademark Office
An excellent alternative to PySpark for those who are not on a cloud platform. The author introduces Dask in a way that speaks directly to an analyst. - Jeremy Loscheider, Panera Bread
A greatly paced introduction to Dask with real-world datasets. - George Thomas, R&D Architecture Manhattan Associates
The ultimate resource to quickly get up and running with Dask and parallel processing in Python. - Gustavo Patino, Oakland University William Beaumont School of Medicine

Applied Supervised Learning with R

Applied Supervised Learning with R equips you with the essential knowledge and practical skills to leverage machine learning techniques for solving business problems using R. With this book, you'll gain hands-on experience in implementing various supervised learning models, assessing their performance, and selecting the best-suited method for your objectives.
What this Book will help me do
• Gain expertise in identifying and framing business problems suitable for supervised learning.
• Acquire skills in data wrangling and visualization using R packages like dplyr and ggplot2.
• Master techniques for tuning hyperparameters to optimize machine learning models.
• Understand methods for feature selection and dimensionality reduction to enhance model performance.
• Learn how to deploy machine learning models to production environments, such as AWS Lambda.
Author(s)
Karthik Ramasubramanian and Jojo Moolayil are both seasoned data science practitioners and educators who bring a wealth of experience in machine learning and analytics. With a deep understanding of R and its applications in real-world scenarios, they offer practical insights and actionable examples to their readers. Their teaching style focuses on clarity and practical application.
Who is it for?
This book is ideal for data analysts, data scientists, and data engineers at a beginner to intermediate level who aim to master supervised machine learning with R. Readers should have basic knowledge of statistics, probability, and R programming. It is designed for those eager to apply machine learning techniques to real-world problems and improve their decision-making capabilities.

Hands-On Time Series Analysis with R

Dive into the intricacies of time series analysis and forecasting with R in this comprehensive guide. From foundational concepts to practical implementations, this book equips you with the tools and techniques to analyze, understand, and predict time-dependent data.
What this Book will help me do
• Develop insights by visualizing time-series data and identifying patterns.
• Master statistical time-series concepts including autocorrelation and moving averages.
• Learn and implement forecasting models like ARIMA and exponential smoothing.
• Apply machine learning methodologies for advanced time-series predictions.
• Work with key R packages for cleaning, manipulating, and analyzing time-series data.
Author(s)
Rami Krispin is an accomplished statistician and R programmer with extensive experience in data analysis and time-series modeling. His hands-on approach in utilizing R packages and libraries brings clarity to complex time-series concepts. With a passion for teaching and simplifying intricate topics, Rami ensures readers both grasp the theories and apply them effectively.
Who is it for?
This book is ideal for data analysts, statisticians, and R developers interested in mastering time-series analysis for real-world applications. Designed for readers with a basic understanding of statistics and R programming, it offers a practical approach to learning effective forecasting and data visualization techniques. Professionals aiming to expand their skillset in predictive analytics will find it particularly beneficial.

Graph Algorithms

Learn how graph algorithms can help you leverage relationships within your data to develop intelligent solutions and enhance your machine learning models. With this practical guide, developers and data scientists will discover how graph analytics deliver value, whether they’re used for building dynamic network models or forecasting real-world behavior. Mark Needham and Amy Hodler from Neo4j explain how graph algorithms describe complex structures and reveal difficult-to-find patterns, from finding vulnerabilities and bottlenecks to detecting communities and improving machine learning predictions. You’ll walk through hands-on examples that show you how to use graph algorithms in Apache Spark and Neo4j, two of the most common choices for graph analytics.
• Learn how graph analytics reveal more predictive elements in today’s data
• Understand how popular graph algorithms work and how they’re applied
• Use sample code and tips from more than 20 graph algorithm examples
• Learn which algorithms to use for different types of questions
• Explore examples with working code and sample datasets for Spark and Neo4j
• Create an ML workflow for link prediction by combining Neo4j and Spark

Intro to Python for Computer Science and Data Science: Learning to Program with AI, Big Data and The Cloud

This is the eBook of the printed book and may not include any media, website access codes, or print supplements that may come packaged with the bound book.
For introductory-level Python programming and/or data-science courses.
A groundbreaking, flexible approach to computer science and data science
The Deitels’ Introduction to Python for Computer Science and Data Science: Learning to Program with AI, Big Data and the Cloud offers a unique approach to teaching introductory Python programming, appropriate for both computer-science and data-science audiences. Providing the most current coverage of topics and applications, the book is paired with extensive traditional supplements as well as Jupyter Notebooks supplements. Real-world datasets and artificial-intelligence technologies allow students to work on projects making a difference in business, industry, government and academia. Hundreds of examples, exercises, projects (EEPs), and implementation case studies give students an engaging, challenging and entertaining introduction to Python programming and hands-on data science.
Related Content
Video: Python Fundamentals
Live courses:
• Python Full Throttle with Paul Deitel: A One-Day, Fast-Paced, Code-Intensive Python Presentation
• Python® Data Science Full Throttle with Paul Deitel: Introductory Artificial Intelligence (AI), Big Data and Cloud Case Studies
The book’s modular architecture enables instructors to conveniently adapt the text to a wide range of computer-science and data-science courses offered to audiences drawn from many majors. Computer-science instructors can integrate as many or as few data-science and artificial-intelligence topics as they’d like, and data-science instructors can integrate as much or as little Python as they’d like. The book aligns with the latest ACM/IEEE CS-and-related computing curriculum initiatives and with the Data Science Undergraduate Curriculum Proposal sponsored by the National Science Foundation.

Machine Learning in Production: Developing and Optimizing Data Science Workflows and Applications

The typical data science task in industry starts with an “ask” from the business. But few data scientists have been taught what to do with that ask. This book shows them how to assess it in the context of the business’s goals, reframe it to work optimally for both the data scientist and the employer, and then execute on it. Written by two of the experts who’ve achieved breakthrough optimizations at BuzzFeed, it’s packed with real-world examples that take you from start to finish: from ask to actionable insight. Andrew Kelleher and Adam Kelleher walk you through well-formed, concrete principles for approaching common data science problems, giving you an easy-to-use checklist for effective execution. Using their principles and techniques, you’ll gain deeper understanding of your data, learn how to analyze noise and confounding variables so they don’t compromise your analysis, and save weeks of iterative improvement by planning your projects more effectively upfront. Once you’ve mastered their principles, you’ll put them to work in two realistic, beginning-to-end site optimization tasks. These extended examples come complete with reusable code examples and recommended open-source solutions designed for easy adaptation to your everyday challenges. They will be especially valuable for anyone seeking their first data science job – and everyone who’s found that job and wants to succeed in it.

Data Science Projects with Python

Data Science Projects with Python introduces you to data science and machine learning using Python through practical examples. In this book, you'll learn to analyze, visualize, and model data, applying techniques like logistic regression and random forests. With a case-study method, you'll build confidence implementing insights in real-world scenarios.
What this Book will help me do
• Set up a data science environment with necessary Python libraries such as pandas and scikit-learn.
• Effectively visualize data insights through Matplotlib and summary statistics.
• Apply machine learning models including logistic regression and random forests to solve data problems.
• Identify optimal models through evaluation metrics like k-fold cross-validation.
• Develop confidence in data preparation and modeling techniques for real-world data challenges.
Author(s)
Stephen Klosterman is a seasoned data scientist with a keen interest in practical applications of machine learning. He combines a strong academic foundation with real-world experience to craft relatable content. Stephen excels in breaking down complex topics into approachable lessons, helping learners grow their data science expertise step by step.
Who is it for?
This book is ideal for data analysts, scientists, and business professionals looking to enhance their skills in Python and data science. If you have some experience in Python and a foundational understanding of algebra and statistics, you'll find this book approachable. It offers an excellent gateway to mastering advanced data analysis techniques. Whether you're seeking to explore machine learning or apply data insights, this book supports your growth.

Data Science for Marketing Analytics

Data Science for Marketing Analytics introduces you to leveraging state-of-the-art data science techniques to optimize marketing outcomes. You'll learn how to manipulate and analyze data using Python, create customer segments, and apply machine learning algorithms to predict customer behavior. This book provides a comprehensive, hands-on approach to marketing analytics.
What this Book will help me do
• Learn to use Python libraries like pandas and Matplotlib for data analysis.
• Understand clustering techniques to create meaningful customer segments.
• Implement linear regression for predicting customer lifetime value.
• Explore classification algorithms to model customer preferences.
• Develop skills to build interactive dashboards for marketing reports.
Author(s)
Blanchard, Nona Behera, and Pranshu Bhatnagar are experienced professionals in data science and marketing analytics, with extensive backgrounds in applying machine learning to real-world business applications. They bring a wealth of knowledge and an approachable teaching style to this book, focusing on practical, industry-relevant applications for learners.
Who is it for?
This book is for developers and marketing professionals looking to advance their analytics skills. It is ideal for individuals with a basic understanding of Python and mathematics who want to explore predictive modeling and segmentation strategies. Readers should have a curiosity for data-driven problem-solving in marketing contexts to benefit most from the content.

Hands-On Data Science for Marketing

The book "Hands-On Data Science for Marketing" equips readers with the tools and insights to optimize their marketing campaigns using data science and machine learning techniques. Using practical examples in Python and R, you will learn how to analyze data, predict customer behavior, and implement effective strategies for better customer engagement and retention.
What this Book will help me do
• Understand marketing KPIs and learn to compute and visualize them in Python and R.
• Develop the ability to analyze customer behavior and predict potential high-value customers.
• Master machine learning concepts for customer segmentation and personalized marketing strategies.
• Improve your skills to forecast customer engagement and lifetime value for more effective planning.
• Learn the techniques of A/B testing and their application in refining marketing decisions.
Author(s)
Yoon Hyup Hwang is a seasoned data scientist with a deep interest in the intersection of marketing and technology. With years of expertise in implementing machine learning algorithms in marketing analytics, Yoon brings a unique perspective by blending technical insights with business strategy. As an educator and practitioner, Yoon's approachable style and clear explanations make complex topics accessible for all learners.
Who is it for?
This book is tailored for marketing professionals looking to enhance their strategies using data science, data enthusiasts eager to apply their skills in marketing, and students or engineers seeking to expand their knowledge in this domain. A basic understanding of Python or R is beneficial, but the book is structured to welcome beginners by covering foundational to advanced concepts in a practical way.

Machine Learning with R Quick Start Guide

Machine Learning with R Quick Start Guide takes you through the foundations of machine learning using the R programming language. Starting with the basics, this book introduces key algorithms and methodologies, offering hands-on examples and applicable machine learning solutions that allow you to extract insights and create predictive models.
What this Book will help me do
• Understand the basics of machine learning and apply them using R 3.5.
• Learn to clean, prepare, and visualize data with R to ensure robust data analysis.
• Develop and work with predictive models using various machine learning techniques.
• Discover advanced topics like Natural Language Processing and neural network training.
• Implement end-to-end pipeline solutions, from data collection to predictive analytics, in R.
Author(s)
Sanz, the author of Machine Learning with R Quick Start Guide, is an expert in data science with years of experience in the field of machine learning and R programming. Known for their accessible and detailed teaching style, the author focuses on providing practical knowledge to empower readers in the real world.
Who is it for?
This book is ideal for graduate students and professionals, including aspiring data scientists and data analysts, looking to start their journey in machine learning. Readers are expected to have some familiarity with the R programming language but no prior machine learning experience is necessary. With this book, the audience will gain the ability to confidently navigate machine learning concepts and practices.

R Statistics Cookbook

The "R Statistics Cookbook" offers a comprehensive guide to solving statistical problems using R 3.5. Through over 100 practical recipes, you'll learn to perform essential statistical analyses, such as t-tests and regression, while mastering techniques for data modeling, nonparametric methods, and machine learning. This resource is tailored for tackling statistics-centric challenges across industries.
What this Book will help me do
• Confidently use R 3.5 to perform statistical analyses that meet your data needs.
• Apply various hypothesis testing methods, such as t-tests and ANOVA, effectively.
• Model and forecast data using time series analysis and mixed-effects modeling.
• Implement regression techniques, including Bayesian regression, for actionable insights.
• Leverage robust statistics and the caret package for machine learning applications in R.
Author(s)
Juretig, a professional statistician and experienced educator, has an extensive background in applying statistical methods to real-world problems using R. Their writing combines deep technical knowledge with an approachable teaching style, making complex statistical concepts accessible to learners of varying levels.
Who is it for?
If you're a statistician, data scientist, researcher, or analyst with proficiency in R programming and foundational knowledge of linear algebra, this book is crafted for you. It caters to professionals looking to solidify their statistical knowledge while exploring practical, real-world applications. Whether seeking to apply advanced methods or refine your statistical approaches, this guide provides actionable insights.

Intelligent Data Analysis for Biomedical Applications

Intelligent Data Analysis for Biomedical Applications: Challenges and Solutions presents specialized statistical, pattern recognition, machine learning, data abstraction and visualization tools for the analysis of data and discovery of mechanisms that create data. It provides computational methods and tools for intelligent data analysis, with an emphasis on problem-solving relating to automated data collection, such as computer-based patient records, data warehousing tools, intelligent alarming, effective and efficient monitoring, and more. This book provides useful references for educational institutions, industry professionals, researchers, scientists, engineers and practitioners interested in intelligent data analysis, knowledge discovery, and decision support in databases.
• Provides the methods and tools necessary for intelligent data analysis and gives solutions to problems resulting from automated data collection
• Contains an analysis of medical databases to provide diagnostic expert systems
• Addresses the integration of intelligent data analysis techniques within biomedical information systems

Meta-Analytics

Meta-Analytics: Consensus Approaches and System Patterns for Data Analysis presents an exhaustive set of patterns for data science to use on any machine learning based data analysis task. The book virtually ensures that at least one pattern will lead to better overall system behavior than the use of traditional analytics approaches. The book is ‘meta’ to analytics, covering general analytics in sufficient detail for readers to engage with, and understand, hybrid or meta- approaches. The book has relevance to machine translation, robotics, biological and social sciences, medical and healthcare informatics, economics, business and finance. In addition, the analytics within can be applied to predictive algorithms for everyone from police departments to sports analysts.
• Provides comprehensive and systematic coverage of machine learning-based data analysis tasks
• Enables rapid progress towards competency in data analysis techniques
• Gives exhaustive and widely applicable patterns for use by data scientists
• Covers hybrid or ‘meta’ approaches, along with general analytics
• Lays out information and practical guidance on data analysis for practitioners working across all sectors

Advanced R Statistical Programming and Data Models: Analysis, Machine Learning, and Visualization

Carry out a variety of advanced statistical analyses including generalized additive models, mixed effects models, multiple imputation, machine learning, and missing data techniques using R. Each chapter starts with conceptual background information about the techniques, includes multiple examples using R to achieve results, and concludes with a case study. Written by Matt and Joshua F. Wiley, Advanced R Statistical Programming and Data Models shows you how to conduct data analysis using the popular R language. You’ll delve into the preconditions or hypotheses for various statistical tests and techniques and work through concrete examples using R for a variety of these next-level analytics. This is a must-have guide and reference on using and programming with the R language.
What You’ll Learn
• Conduct advanced analyses in R including: generalized linear models, generalized additive models, mixed-effects models, machine learning, and parallel processing
• Carry out regression modeling using R data visualization, linear and advanced regression, additive models, survival / time to event analysis
• Handle machine learning using R including parallel processing, dimension reduction, and feature selection and classification
• Address missing data using multiple imputation in R
• Work on factor analysis, generalized linear mixed models, and modeling intraindividual variability
Who This Book Is For
Working professionals, researchers, or students who are familiar with R and basic statistical techniques such as linear regression and who want to learn how to use R to perform more advanced analytics. Particularly, researchers and data analysts in the social sciences may benefit from these techniques. Additionally, analysts who need parallel processing to speed up analytics are given proven code to reduce time to result(s).

Kibana 7 Quick Start Guide

Dive into the world of Kibana 7 with this hands-on guide that simplifies the process of visualizing and analyzing data using Elasticsearch. From fundamental concepts to advanced tools, this book enables you to create intuitive dashboards and leverage powerful machine learning capabilities effectively. Discover how to transform your data into actionable insights with ease.
What this Book will help me do
• Configure Logstash to fetch and process CSV data for visualization.
• Master creating and managing index patterns within Kibana for efficient data navigation.
• Effectively apply filters to refine data presentations and insights.
• Develop and utilize machine learning jobs in Kibana to identify trends and anomalies.
• Create, customize, and share impactful visualizations and dashboards to drive data-driven decisions.
Author(s)
Srivastava is a technical expert in data visualization and Elasticsearch tools, with practical experience implementing and teaching about the Elastic Stack. The author brings a hands-on approach to this book, simplifying complex concepts for ease of understanding. Their expertise ensures that the book serves both as a learning guide and a practical reference.
Who is it for?
This book is ideal for developers and IT professionals who are either new to Kibana or looking to deepen their understanding of its visualization capabilities. It is suitable for individuals working with the Elastic Stack or seeking to leverage Kibana for data analysis purposes. If you are progressing from a novice to an intermediate level, this guide will provide future-proof skills to optimize your workflow.

MATLAB Machine Learning Recipes: A Problem-Solution Approach

Harness the power of MATLAB to resolve a wide range of machine learning challenges. This book provides a series of examples of technologies critical to machine learning. Each example solves a real-world problem. All code in MATLAB Machine Learning Recipes: A Problem-Solution Approach is executable. The toolbox that the code uses provides a complete set of functions needed to implement all aspects of machine learning. Authors Michael Paluszek and Stephanie Thomas show how all of these technologies allow the reader to build sophisticated applications to solve problems with pattern recognition, autonomous driving, expert systems, and much more.
What you'll learn:
• How to write code for machine learning, adaptive control and estimation using MATLAB
• How these three areas complement each other
• How these three areas are needed for robust machine learning applications
• How to use MATLAB graphics and visualization tools for machine learning
• How to code real-world examples in MATLAB for major applications of machine learning in big data
Who is this book for:
The primary audiences are engineers, data scientists and students wanting a comprehensive code cookbook rich in examples on machine learning using MATLAB.

Principles of Data Science - Second Edition

Dive into the intricacies of data science with 'Principles of Data Science'. This book takes you on a journey to explore, analyze, and transform data into actionable insights using mathematical models, Python programming, and machine learning concepts. With a clear and engaging style, you will progress from understanding theoretical foundations to implementing advanced techniques in real-world scenarios.
What this Book will help me do
• Master the five critical steps in a practical data science workflow.
• Clean and prepare raw datasets for accurate machine learning models.
• Understand and apply statistical models and mathematical principles for data analysis.
• Build and evaluate predictive models using Python and effective metrics.
• Create impactful visualizations that clearly convey data insights.
Author(s)
Sinan Ozdemir is an expert in data science, with a background in developing and teaching advanced courses in machine learning and predictive analytics. With co-authors Kakade and Tibaldeschi, they bring years of hands-on experience in data science to this comprehensive guide. Their approach simplifies complex concepts, making them accessible without sacrificing depth, to empower readers to make data-driven decisions confidently.
Who is it for?
This book is ideal for aspiring data scientists seeking a practical introduction to the field. It's perfect for those with basic math skills looking to apply them to data science or experienced programmers who want to explore the mathematical foundation of data science. A basic understanding of Python programming will be invaluable, but the book builds up core concepts step-by-step, making it accessible to both beginners and experienced professionals.

Numerical Python: Scientific Computing and Data Science Applications with Numpy, SciPy and Matplotlib

Leverage the numerical and mathematical modules in Python and its standard library as well as popular open source numerical Python packages like NumPy, SciPy, FiPy, Matplotlib and more. This fully revised edition, updated with the latest details of each package and changes to Jupyter projects, demonstrates how to numerically compute solutions and mathematically model applications in big data, cloud computing, financial engineering, business management and more. Numerical Python, Second Edition, presents many brand-new case study examples of applications in data science and statistics using Python, along with extensions to many previous examples. Each of these demonstrates the power of Python for rapid development and exploratory computing due to its simple and high-level syntax and multiple options for data analysis. After reading this book, readers will be familiar with many computing techniques including array-based and symbolic computing, visualization and numerical file I/O, equation solving, optimization, interpolation and integration, and domain-specific computational problems, such as differential equation solving, data analysis, statistical modeling and machine learning. What You'll Learn: Work with vectors and matrices using NumPy. Plot and visualize data with Matplotlib. Perform data analysis tasks with Pandas and SciPy. Review statistical modeling and machine learning with statsmodels and scikit-learn. Optimize Python code using Numba and Cython. Who This Book Is For: Developers who want to understand how to use Python and its related ecosystem for numerical computing.

Bioinformatics with Python Cookbook - Second Edition

"Bioinformatics with Python Cookbook" offers a detailed exploration into the modern approaches to computational biology using the Python programming language. Through hands-on recipes, you will master the practical applications of bioinformatics, enabling you to analyze vast biological data effectively using Python libraries and tools. What this Book will help me do Master processing and analyzing genomic datasets in Python to enable accurate bioinformatics discoveries. Understand and apply next-generation sequencing techniques for advanced biological research. Learn to utilize machine learning approaches such as PCA and decision trees for insightful data analysis in biology. Gain proficiency in using high-performance computing frameworks like Dask and Spark for scalable bioinformatics workflows. Develop capabilities to visually represent biological data interactions and insights for presentation and analysis. Author(s) Tiago Antao is a computational scientist specializing in bioinformatics with extensive experience in Python programming applied to biological sciences. He has worked on numerous bioinformatics projects and has a special interest in using Python to bridge biology and data science. Tiago's approachable writing style ensures that both newcomers and experts benefit from his insights. Who is it for? This book is designed for bioinformatics professionals, researchers, and data scientists who are eager to harness the power of Python programming for their biological data analysis needs. If you are familiar with Python and are looking to tackle intermediate to advanced bioinformatics challenges using practical recipes, this book is ideal for you. It is suitable for those seeking to expand their knowledge in computational biology and data visualization techniques. Whether you are working on next-generation sequencing or population genetics, this resource will guide you effectively.

Hands-On Data Science with R

Dive into "Hands-On Data Science with R" and embark on a journey to master the R language for practical data science applications. This comprehensive guide walks through data manipulation, visualization, and advanced analytics, preparing you to tackle real-world data challenges with confidence. What this Book will help me do: Understand how to utilize popular R packages effectively for data science tasks. Learn techniques for cleaning, preprocessing, and exploring datasets. Gain insights into implementing machine learning models in R for predictive analytics. Master the use of advanced visualization tools to extract and communicate insights. Develop expertise in integrating R with big data platforms like Hadoop and Spark. Author(s): This book was written by experts in data science and R, including Doug Ortiz and his co-authors. They bring years of industry experience and a desire to teach, presenting complex topics in an approachable manner. Who is it for? This book is designed for data analysts, statisticians, and programmers with basic R knowledge who want to dive into machine learning and predictive analytics. If you're aiming to enhance your skill set or gain confidence in tackling real-world data problems, this book is an excellent choice.

Ensemble Classification Methods with Applications in R

An essential guide to two burgeoning topics in machine learning: classification trees and ensemble learning. Ensemble Classification Methods with Applications in R introduces the concepts and principles of ensemble classifier methods and includes a review of the most commonly used techniques. This important resource shows how ensemble classification has become an extension of the individual classifiers. The text puts the emphasis on two areas of machine learning: classification trees and ensemble learning. The authors explore ensemble classification methods’ basic characteristics and explain the types of problems that can emerge in their application. Written by a team of noted experts in the field, the text is divided into two main sections. The first section outlines the theoretical underpinnings of the topic and the second section is designed to include examples of practical applications. The book contains a wealth of illustrative cases of business failure prediction, zoology, ecology and others. This vital guide: offers an important text that has been tested both in the classroom and at tutorials at conferences; contains authoritative information written by leading experts in the field; presents a comprehensive text that can be applied to courses in machine learning, data mining, and artificial intelligence; and combines in one volume two of the most intriguing topics in machine learning: ensemble learning and classification trees. Written for researchers from many fields such as biostatistics, economics, environmental science, and zoology, as well as students of data mining and machine learning, Ensemble Classification Methods with Applications in R puts the focus on two topics in machine learning: classification trees and ensemble learning.

Learning Apache Drill

Get up to speed with Apache Drill, an extensible distributed SQL query engine that reads massive datasets in many popular file formats such as Parquet, JSON, and CSV. Drill reads data in HDFS or in cloud-native storage such as S3 and works with Hive metastores along with distributed databases such as HBase, MongoDB, and relational databases. Drill works everywhere: on your laptop or in your largest cluster. In this practical book, Drill committers Charles Givre and Paul Rogers show analysts and data scientists how to query and analyze raw data using this powerful tool. Data scientists today spend about 80% of their time just gathering and cleaning data. With this book, you’ll learn how Drill helps you analyze data more effectively to drive down time to insight. Use Drill to clean, prepare, and summarize delimited data for further analysis. Query file types including logfiles, Parquet, JSON, and other complex formats. Query Hadoop, relational databases, MongoDB, and Kafka with standard SQL. Connect to Drill programmatically using a variety of languages. Use Drill even with challenging or ambiguous file formats. Perform sophisticated analysis by extending Drill’s functionality with user-defined functions. Facilitate data analysis for network security, image metadata, and machine learning.