talk-data.com

Event

O'Reilly Data Science Books

2013-08-09 – 2026-02-25 O'Reilly

Activities tracked

324

Collection of O'Reilly books on Data Science.

Sessions & talks

Fundamentals of Data Science

Fundamentals of Data Science: Theory and Practice presents basic and advanced concepts in data science along with real-life applications. The book provides students, researchers and professionals at different levels a good understanding of the concepts of data science, machine learning, data mining and analytics. Users will find the authors’ research experiences and achievements in data science applications, along with in-depth discussions on topics that are essential for data science projects, including pre-processing, which is carried out before applying predictive and descriptive data analysis tasks, and proximity measures for numeric, categorical and mixed-type data. The book's authors include a systematic presentation of many predictive and descriptive learning algorithms, including recent developments that have successfully handled large datasets with high accuracy. In addition, a number of descriptive learning tasks are included. Presents the foundational concepts of data science along with advanced concepts and real-life applications for applied learning. Includes coverage of a number of key topics such as data quality and pre-processing, proximity and validation, predictive data science, descriptive data science, ensemble learning, association rule mining, Big Data analytics, as well as incremental and distributed learning. Provides updates on key applications of data science techniques in areas such as Computational Biology, Network Intrusion Detection, Natural Language Processing, Software Clone Detection, Financial Data Analysis, and Scientific Time Series Data Analysis. Covers computer program code for implementing descriptive and predictive algorithms.

Google Cloud Platform for Data Science: A Crash Course on Big Data, Machine Learning, and Data Analytics Services

This book is your practical and comprehensive guide to learning Google Cloud Platform (GCP) for data science, using only the free tier services offered by the platform. Data science and machine learning are increasingly becoming critical to businesses of all sizes, and the cloud provides a powerful platform for these applications. GCP offers a range of data science services that can be used to store, process, and analyze large datasets, and train and deploy machine learning models. The book is organized into seven chapters covering various topics such as GCP account setup, Google Colaboratory, Big Data and Machine Learning, Data Visualization and Business Intelligence, Data Processing and Transformation, Data Analytics and Storage, and Advanced Topics. Each chapter provides step-by-step instructions and examples illustrating how to use GCP services for data science and big data projects. Readers will learn how to set up a Google Colaboratory account and run Jupyter notebooks, access GCP services and data from Colaboratory, use BigQuery for data analytics, and deploy machine learning models using Vertex AI. The book also covers how to visualize data using Looker Data Studio, run data processing pipelines using Google Cloud Dataflow and Dataprep, and store data using Google Cloud Storage and SQL.
What You Will Learn Set up a GCP account and project Explore BigQuery and its use cases, including machine learning Understand Google Cloud AI Platform and its capabilities Use Vertex AI for training and deploying machine learning models Explore Google Cloud Dataproc and its use cases for big data processing Create and share data visualizations and reports with Looker Data Studio Explore Google Cloud Dataflow and its use cases for batch and stream data processing Run data processing pipelines on Cloud Dataflow Explore Google Cloud Storage and its use cases for data storage Get an introduction to Google Cloud SQL and its use cases for relational databases Get an introduction to Google Cloud Pub/Sub and its use cases for real-time data streaming Who This Book Is For Data scientists, machine learning engineers, and analysts who want to learn how to use Google Cloud Platform (GCP) for their data science and big data projects

Data Smart, 2nd Edition

Want to jump into data science but don't know where to start? Let's be real, data science is presented as something mystical and unattainable without the most powerful software, hardware, and data expertise. Real data science isn't about technology. It's about how you approach the problem. In this updated edition of Data Smart: Using Data Science to Transform Information into Insight, award-winning data scientist and bestselling author Jordan Goldmeier shows you how to implement data science problems using Excel while exposing how things work behind the scenes. Data Smart is your field guide to building statistics, machine learning, and powerful artificial intelligence concepts right inside your spreadsheet. Inside you'll find: Four-color data visualizations that highlight and illustrate the concepts discussed in the book Tutorials explaining complicated data science using just Microsoft Excel How to take what you’ve learned and apply it to everyday problems at work and life Advice for using formulas, Power Query, and some of Excel's latest features to solve tough data problems Smart data science solutions for common business challenges Explanations of what algorithms do, how they work, and what you can tweak to take your Excel skills to the next level Data Smart is a must-read for students, analysts, and managers ready to become data science savvy and share their findings with the world.

Python for Data Science For Dummies, 3rd Edition

Let Python do the heavy lifting for you as you analyze large datasets Python for Data Science For Dummies lets you get your hands dirty with data using one of the top programming languages. This beginner’s guide takes you step by step through getting started, performing data analysis, understanding datasets and example code, working with Google Colab, sampling data, and beyond. Coding your data analysis tasks will make your life easier, make you more in-demand as an employee, and open the door to valuable knowledge and insights. This new edition is updated for the latest version of Python and includes current, relevant data examples. Get a firm background in the basics of Python coding for data analysis Learn about data science careers you can pursue with Python coding skills Integrate data analysis with multimedia and graphics Manage and organize data with cloud-based relational databases Python careers are on the rise. Grab this user-friendly Dummies guide and gain the programming skills you need to become a data pro.

Data Science: The Hard Parts

This practical guide provides a collection of techniques and best practices that are generally overlooked in most data engineering and data science pedagogy. A common misconception is that great data scientists are experts in the "big themes" of the discipline—machine learning and programming. But most of the time, these tools can only take us so far. In practice, the smaller tools and skills really separate a great data scientist from a not-so-great one. Taken as a whole, the lessons in this book make the difference between an average data scientist candidate and a qualified data scientist working in the field. Author Daniel Vaughan has collected, extended, and used these skills to create value and train data scientists from different companies and industries. With this book, you will: Understand how data science creates value Deliver compelling narratives to sell your data science project Build a business case using unit economics principles Create new features for a ML model using storytelling Learn how to decompose KPIs Perform growth decompositions to find root causes for changes in a metric Daniel Vaughan is head of data at Clip, the leading paytech company in Mexico. He's the author of Analytical Skills for AI and Data Science (O'Reilly).

R Bioinformatics Cookbook - Second Edition

R Bioinformatics Cookbook is your guide to leveraging the power of R for advanced bioinformatics tasks. This updated second edition uses a recipe-based method to teach data analysis, visualization, and machine learning tailored for biological datasets. You'll gain hands-on experience with popular tools like Bioconductor, ggplot2, and tidyverse to solve real-world genomics problems. What this Book will help me do Set up a reproducible bioinformatics analysis environment using R. Clean, analyze, and visualize biological data with R's powerful packages. Apply RNA-seq and ChIP-seq workflows to study genetic information effectively. Incorporate machine learning techniques into bioinformatics pipelines using R. Automate tasks and create professional-grade reports using functional programming and reporting tools. Author(s) The author, MacLean, brings years of expertise in bioinformatics and computational biology. Known for clear explanations and practical approaches, they ensure the material is accessible yet challenging. With a strong focus on real-world applications, this book reflects their commitment to bridging bioinformatics and modern data science. Who is it for? This book is perfect for bioinformaticians, researchers, and data scientists with prior R experience. It's tailored for those looking to delve deeper into genomics, data visualization, and bioinformatics techniques. Intermediate knowledge of bioinformatics concepts and familiarity with R programming are assumed for readers to fully benefit from the content.

The Statistics and Machine Learning with R Workshop

This book guides readers through the essentials of applied statistics and machine learning using the R programming language. By delving into robust data processing techniques, visualization, and statistical modeling with R, you will develop skills to effectively analyze data and design predictive models. Each chapter includes hands-on exercises to reinforce the concepts in a practical, intuitive way. What this Book will help me do Understand and apply key statistical concepts such as probability distributions and hypothesis testing to analyze data. Master foundational mathematical principles like linear algebra and calculus relevant to data science and machine learning. Develop proficiency in data manipulation and visualization using robust R libraries such as dplyr and ggplot2. Build predictive models through practical exercises and learn advanced concepts like Bayesian statistics and linear regression. Gain the practical knowledge needed to apply statistical and machine learning methodologies in real-world scenarios. Author(s) Liu Peng is an accomplished author with a strong academic and practical background in statistics and data science. Armed with extensive experience in applying R to real-world problems, he brings a blend of technical mastery and teaching expertise. His commitment is to transform complex concepts into accessible, enriching learning experiences for readers. Who is it for? This book is ideal for data scientists and analysts ranging from beginners to those at an intermediate level. It caters especially to those interested in practicing statistical modeling and learning R in depth. If you have basic familiarity with statistics and are looking to expand your data science capabilities using R, this book is well-suited for you.

Hands-On Web Scraping with Python - Second Edition

In "Hands-On Web Scraping with Python," you'll learn how to harness the power of Python libraries to extract, process, and analyze data from the web. This book provides a practical, step-by-step guide for beginners and data enthusiasts alike. What this Book will help me do Master the use of Python libraries like requests, lxml, Scrapy, and Beautiful Soup for web scraping. Develop advanced techniques for secure browsing and data extraction using APIs and Selenium. Understand the principles behind regex and PDF data parsing for comprehensive scraping. Analyze and visualize data using data science tools such as Pandas and Plotly. Build a portfolio of real-world scraping projects to demonstrate your capabilities. Author(s) Anish Chapagain, the author of "Hands-On Web Scraping with Python," is an experienced programmer and instructor who specializes in Python and data-related technologies. With his vast experience in teaching individuals from diverse backgrounds, Anish approaches complex concepts with clarity and a hands-on methodology. Who is it for? This book is perfect for aspiring data scientists, Python beginners, and anyone who wants to delve into web scraping. Readers should have a basic understanding of how websites work but no prior coding experience is required. If you aim to develop scraping skills and understand data analysis, this book is the ideal starting point.

Streamlit for Data Science - Second Edition

Streamlit for Data Science is your complete guide to mastering the creation of powerful, interactive data-driven applications using Python and Streamlit. With this comprehensive resource, you'll learn everything from foundational Streamlit skills to advanced techniques like integrating machine learning models and deploying apps to cloud platforms, enabling you to significantly enhance your data science toolkit. What this Book will help me do Master building interactive applications using Streamlit, including techniques for user interfaces and integrations. Develop visually appealing and functional data visualizations using Python libraries in Streamlit. Learn to integrate Streamlit applications with machine learning frameworks and tools like Hugging Face and OpenAI. Understand and apply best practices to deploy Streamlit apps to cloud platforms such as Streamlit Community Cloud and Heroku. Improve practical Python skills through implementing end-to-end data applications and prototyping data workflows. Author(s) Tyler Richards, the author of Streamlit for Data Science, is a senior data scientist with in-depth practical experience in building data-driven applications. With a passion for Python and data visualization, Tyler leverages his knowledge to help data professionals craft effective and compelling tools. His teaching approach combines clarity, hands-on exercises, and practical relevance. Who is it for? This book is written for data scientists, engineers, and enthusiasts who use Python and want to create dynamic data-driven applications. With a focus on those who have some familiarity with Python and libraries like Pandas or NumPy, it assists readers in building on their knowledge by offering tailored guidance. Perfect for those looking to prototype data projects or enhance their programming toolkit.

Learning Data Science

As an aspiring data scientist, you appreciate why organizations rely on data for important decisions—whether it's for companies designing websites, cities deciding how to improve services, or scientists discovering how to stop the spread of disease. And you want the skills required to distill a messy pile of data into actionable insights. We call this the data science lifecycle: the process of collecting, wrangling, analyzing, and drawing conclusions from data. Learning Data Science is the first book to cover foundational skills in both programming and statistics that encompass this entire lifecycle. It's aimed at those who wish to become data scientists or who already work with data scientists, and at data analysts who wish to cross the "technical/nontechnical" divide. If you have a basic knowledge of Python programming, you'll learn how to work with data using industry-standard tools like pandas. Refine a question of interest to one that can be studied with data Pursue data collection that may involve text processing, web scraping, etc. Glean valuable insights about data through data cleaning, exploration, and visualization Learn how to use modeling to describe the data Generalize findings beyond the data

Building Statistical Models in Python

Building Statistical Models in Python is your go-to guide for mastering statistical modeling techniques using Python. By reading this book, you will explore how to use Python libraries like statsmodels and others to tackle tasks such as regression, classification, and time series analysis. What this Book will help me do Develop a deep practical knowledge of statistical concepts and their implementation in Python. Create regression and classification models to solve real-world problems. Gain expertise analyzing time series data and generating valuable forecasts. Learn to perform hypothesis verification to interpret data correctly. Understand survival analysis and apply it in various industry scenarios. Author(s) Huy Hoang Nguyen, Paul N Adams, and Stuart J Miller bring their extensive expertise in data science and Python programming to the table. With years of professional experience in both industry and academia, they aim to make statistical modeling approachable and applicable. Combining technical depth with hands-on coding, their goal is to ensure readers not only understand the theory but also gain confidence in its application. Who is it for? This book is tailored for beginners and intermediate programmers seeking to learn statistical modeling without a prerequisite in mathematics. It's ideal for data analysts, data scientists, and Python enthusiasts who want to leverage statistical models to gain insights from data. With this book, you will journey from the basics to advanced applications, making it perfect for those who aim to master statistical analysis.

Good Charts, Updated and Expanded

The ultimate guide to data visualization and information design for business. Making good charts is a must-have skill for managers today. The vast amount of data that drives business isn't useful if you can't communicate the valuable ideas contained in that data—the threats, the opportunities, the hidden trends, the future possibilities. But many think that data visualization is too difficult—a specialist skill that's either the province of data scientists and complex software packages or the domain of professional designers and their visual creativity. Not so. Anyone can learn to produce quality "dataviz" and, more broadly, clear and effective information design. Good Charts will show you how to do it. In this updated and expanded edition, dataviz expert Scott Berinato provides all you need for turning those ordinary charts kicked out of a spreadsheet program into extraordinary visuals that captivate and persuade your audience and for transforming presentations that seem like a mishmash of charts and bullet points into clear, effective, persuasive storytelling experiences. Good Charts shows how anyone who invests a little time getting better at visual communication can create an outsized impact—both in their career and in their organization. You will learn: A framework for getting to better charts in just a few minutes Design techniques that immediately make your visuals clearer and more persuasive The building blocks of storytelling with your data How to build teams to bring visual communication skills into your organization and culture This new edition of Good Charts not only provides new visuals and updated concepts but adds an entirely new chapter on building teams around the visualization part of a data science operation and creating workflows to integrate visualization into everything you do. Graphics that merely present information won't cut it anymore. 
Make Good Charts your go-to resource for turning plain, uninspiring charts and presentations into smart, effective visualizations and stories that powerfully convey ideas.

M-statistics

A comprehensive resource providing new statistical methodologies and demonstrating how new approaches work for applications. M-statistics introduces a new approach to statistical inference, redesigning the fundamentals of statistics, and improving on the classical methods we already use. This book targets exact optimal statistical inference for a small sample under one methodological umbrella. Two competing approaches are offered, maximum concentration (MC) and mode (MO) statistics, hence the symbolic equation M=MC+MO. M-statistics defines an estimator as the limit point of the MC or MO exact optimal confidence interval when the confidence level approaches zero, the MC and MO estimator, respectively. Neither mean nor variance plays a role in M-statistics theory. Novel statistical methodologies in the form of double-sided unbiased and short confidence intervals and tests apply to major statistical parameters. Exact statistical inference for small sample sizes is illustrated with effect size and coefficient of variation, the rate parameter of the Pareto distribution, two-sample statistical inference for normal variance, and the rate of exponential distributions. M-statistics is illustrated with discrete, binomial, and Poisson distributions. Novel estimators eliminate paradoxes with the classic unbiased estimators when the outcome is zero. Exact optimal statistical inference applies to correlation analysis including Pearson correlation, squared correlation coefficient, and coefficient of determination. New MC and MO estimators along with optimal statistical tests, accompanied by respective power functions, are developed. M-statistics is extended to the multidimensional parameter and illustrated with the simultaneous statistical inference for the mean and standard deviation, shape parameters of the beta distribution, the two-sample binomial distribution, and finally, nonlinear regression.
Our new developments are accompanied by respective algorithms and R codes, available at GitHub, and as such readily available for applications. M-statistics is suitable for professionals and students alike. It is highly useful for theoretical statisticians and teachers, researchers, and data science analysts as an alternative to classical and approximate statistical inference.

Building Data Science Applications with FastAPI - Second Edition

Building Data Science Applications with FastAPI is your comprehensive guide to mastering the FastAPI framework to build efficient, reliable data science applications and APIs. You'll explore examples and projects that integrate machine learning models, manage databases, and leverage advanced FastAPI features like asynchronous I/O and WebSockets. What this Book will help me do Develop an understanding of the fundamentals and advanced features of the FastAPI framework, like dependency injection and type hinting. Learn how to integrate machine learning models into a FastAPI-based web backend effectively. Master concepts of authentication, database connections, and asynchronous programming in Python. Build and deploy two practical AI applications: a real-time object detection tool and a text-to-image generator. Acquire skills to monitor, log, and maintain software systems for optimal performance and reliability. Author(s) François Voron is an experienced Python developer and data scientist with extensive knowledge of web frameworks including FastAPI. With years of experience designing and deploying machine learning and data science applications, François focuses on empowering developers with practical techniques and real-world applications. His guidance helps readers tackle contemporary challenges in software development. Who is it for? This book is ideal for data scientists and software engineers looking to broaden their skillset by creating robust web APIs for data science applications. Readers are expected to have a working knowledge of Python and basic data science concepts, offering them a chance to expand into backend development. If you're keen to deploy machine learning models and integrate them seamlessly with web technologies, this book is for you. It provides both fundamental insights and advanced techniques to serve a broad range of learners.

Data Analytic Literacy

The explosive growth in volume and varieties of data generated by the seemingly endless arrays of digital systems and applications is rapidly elevating the importance of being able to utilize data; in fact, data analytic literacy is becoming as important now, at the onset of the Digital Era, as rudimentary literacy and numeracy were throughout the Industrial Era. And yet, what constitutes data analytic literacy is poorly understood. To some, data analytic literacy is the ability to use basic statistics, to others it is data science ‘light’, and to still others it is just general familiarity with common data analytic outcomes. Exploring the scope and the structure of rudimentary data analytic competencies is at the core of this book which takes the perspective that data analytics is a new and distinct domain of knowledge and practice. It offers application-minded framing of rudimentary data analytic competencies built around conceptually sound and practically meaningful processes and mechanics of systematically transforming messy and heterogeneous data into informative insights. Data Analytic Literacy is meant to offer an easy-to-follow overview of the critical elements of the reasoning behind basic data manipulation and analysis approaches and steps, coupled with the commonly used data analytic and data communication techniques and tools. It offers an all-inclusive guide to developing basic data analytic competencies.

Learn Enough Python to Be Dangerous: Software Development, Flask Web Apps, and Beginning Data Science with Python

All You Need to Know, and Nothing You Don't, to Solve Real Problems with Python Python is one of the most popular programming languages in the world, used for everything from shell scripts to web development to data science. As a result, Python is a great language to learn, but you don't need to learn "everything" to get started, just how to use it efficiently to solve real problems. In Learn Enough Python to Be Dangerous, renowned instructor Michael Hartl teaches the specific concepts, skills, and approaches you need to be professionally productive. Even if you've never programmed before, Hartl helps you quickly build technical sophistication and master the lore you need to succeed. Hartl introduces Python both as a general-purpose language and as a specialist tool for web development and data science, presenting focused examples and exercises that help you internalize what matters, without wasting time on details pros don't care about. Soon, it'll be like you were born knowing this stuff--and you'll be suddenly, seriously dangerous. Learn enough about . . . 
Applying core Python concepts with the interactive interpreter and command line Writing object-oriented code with Python's native objects Developing and publishing self-contained Python packages Using elegant, powerful functional programming techniques, including Python comprehensions Building new objects, and extending them via Test-Driven Development (TDD) Leveraging Python's exceptional shell scripting capabilities Creating and deploying a full web app, using routes, layouts, templates, and forms Getting started with data-science tools for numerical computations, data visualization, data analysis, and machine learning Mastering concrete and informal skills every developer needs Michael Hartl's Learn Enough Series includes books and video courses that focus on the most important parts of each subject, so you don't have to learn everything to get started--you just have to learn enough to be dangerous and solve technical problems yourself. Like this book? Don't miss Michael Hartl's companion video tutorial, Learn Enough Python to Be Dangerous LiveLessons. Register your book for convenient access to downloads, updates, and/or corrections as they become available. See inside book for details.

Dive Into Data Science

Dive into the exciting world of data science with this practical introduction. Packed with essential skills and useful examples, Dive Into Data Science will show you how to obtain, analyze, and visualize data so you can leverage its power to solve common business challenges. With only a basic understanding of Python and high school math, you’ll be able to effortlessly work through the book and start implementing data science in your day-to-day work. From improving a bike sharing company to extracting data from websites and creating recommendation systems, you’ll discover how to find and use data-driven solutions to make business decisions. Topics covered include conducting exploratory data analysis, running A/B tests, performing binary classification using logistic regression models, and using machine learning algorithms. You’ll also learn how to:
• Forecast consumer demand
• Optimize marketing campaigns
• Reduce customer attrition
• Predict website traffic
• Build recommendation systems
With this practical guide at your fingertips, harness the power of programming, mathematical theory, and good old common sense to find data-driven solutions that make a difference. Don’t wait; dive right in!

R for Data Science, 2nd Edition

Use R to turn data into insight, knowledge, and understanding. With this practical book, aspiring data scientists will learn how to do data science with R and RStudio, along with the tidyverse—a collection of R packages designed to work together to make data science fast, fluent, and fun. Even if you have no programming experience, this updated edition will have you doing data science quickly. You'll learn how to import, transform, and visualize your data and communicate the results. And you'll get a complete, big-picture understanding of the data science cycle and the basic tools you need to manage the details. Updated for the latest tidyverse features and best practices, new chapters show you how to get data from spreadsheets, databases, and websites. Exercises help you practice what you've learned along the way. You'll understand how to: Visualize: Create plots for data exploration and communication of results Transform: Discover variable types and the tools to work with them Import: Get data into R and in a form convenient for analysis Program: Learn R tools for solving data problems with greater clarity and ease Communicate: Integrate prose, code, and results with Quarto

Power BI Machine Learning and OpenAI

Microsoft Power BI Machine Learning and OpenAI offers a comprehensive exploration into advanced data analytics and artificial intelligence using Microsoft Power BI. Through hands-on, workshop-style examples, readers will discover the integration of machine learning models and OpenAI features to enhance business intelligence. This book provides practical examples, real-world scenarios, and step-by-step guidance. What this Book will help me do Learn to apply machine learning capabilities within Power BI to create predictive analytics Understand how to integrate OpenAI services to build enhanced analytics workflows Gain hands-on experience in using R and Python for advanced data visualization in Power BI Master the skills needed to build and deploy SaaS auto ML models within Power BI Leverage Power BI's AI visuals and features to elevate data storytelling Author(s) Greg Beaumont, an expert in data science and business intelligence, brings years of experience in Power BI and analytics to this book. With a focus on practical applications, Greg empowers readers to harness the power of AI and machine learning to elevate their data solutions. As a consultant and trainer, he shares his deep knowledge to help readers unlock the full potential of their tools. Who is it for? This book is ideal for data analysts, BI professionals, and data scientists who aim to integrate machine learning and OpenAI into their workflows. If you're familiar with Power BI's fundamentals and are eager to explore its advanced capabilities, this guide is tailored for you. Perfect for professionals looking to elevate their analytics to a new level, combining data science concepts with Power BI's features.

Forecasting Time Series Data with Prophet - Second Edition

Discover how to effectively forecast time series data using Prophet, the versatile open-source tool developed by Meta. Whether you're a business analyst or a machine learning expert, this book provides comprehensive insights into creating, diagnosing, and refining forecasting models. By mastering Prophet, you'll be equipped to make accurate predictions that drive decisions.
What this Book will help me do
Master the core principles of using Prophet for time series forecasting.
Ensure your forecasts are accurate and robust for better decision-making.
Gain experience in handling real-world forecasting challenges, like seasonality and outliers.
Learn how to fine-tune and optimize models using additional regressors.
Understand productionalization of forecasting models to apply solutions at scale.
Author(s)
Greg Rafferty is a seasoned data scientist specializing in time series analysis and machine learning. With years of practical experience building forecasting models in industries ranging from finance to e-commerce, Greg is dedicated to teaching accessible and actionable approaches to data science. Through clear explanations and practical examples, he empowers readers to solve challenging forecasting problems with confidence.
Who is it for?
Ideal for data scientists, business analysts, machine learning engineers, and software developers seeking to enhance their forecasting skills with Prophet. Whether you're familiar with time series concepts or just starting to explore forecasting methods, this book helps you advance from fundamental understanding to practical application of state-of-the-art techniques for impactful results.

Applied Geospatial Data Science with Python

"Applied Geospatial Data Science with Python" introduces readers to the power of integrating geospatial data into data science workflows. This book equips you with practical methods for processing, analyzing, and visualizing spatial data to solve real-world problems. Through hands-on examples and clear, actionable advice, you will master the art of spatial data analysis using Python.
What this Book will help me do
Learn to process, analyze, and visualize geospatial data using Python libraries.
Develop a foundational understanding of GIS and geospatial data science principles.
Gain skills in building geospatial AI and machine learning models for specific use cases.
Apply geospatial data workflows to practical scenarios like optimization and clustering.
Create a portfolio of geospatial data science projects relevant across different industries.
Author(s)
David S. Jordan is an experienced data scientist with years of expertise in GIS and geospatial analytics. With a passion for making complex topics accessible, David leverages his deep technical knowledge to provide practical, hands-on instruction. His approach emphasizes real-world applications and encourages learners to develop confidence as they work with geospatial data.
Who is it for?
This book is perfect for data scientists looking to integrate geospatial data analysis into their existing workflows, and GIS professionals seeking to expand into data science. If you already have a basic knowledge of Python for data analysis or data science and want to explore how to work effectively with geospatial data to drive impactful solutions, this is the book for you.

Leading Biotech Data Teams

With hundreds of startups founded each year, the relatively new field of data-focused biotech (or TechBio) is growing rapidly. But without enough experienced practitioners to go around, most organizations hire data scientists with minimal biotech experience and lab scientists who've taken a crash course in data science. This arrangement is problematic. The way lab scientists and data scientists think and work is fundamentally different. But there is a solution. This report introduces biocode principles to help these scientists reframe the way they think about their role, their team's role, and the tools they use to fulfill those roles. Lab and data scientists alike will learn how to address the underlying issues so they can focus on solving these technology problems together.
Each of the following chapters presents a vital biocode principle:
"Defining Objectives" explores how to broaden the way teams view their work, shifting from purely technical objectives to organizational-level scientific objectives
"Building Collaborations" encourages teams to focus their energy on collaboration with partner teams rather than guard their time for technical work
"Deploying Tooling" covers ways to coordinate each team's work with the cadence of experiments and lab work

The Kaggle Workbook

"The Kaggle Workbook" is an engaging and practical guide for anyone looking to excel in Kaggle competitions by learning from real past case studies and hands-on exercises. Inside, you'll dive deep into key data science concepts, explore how Kaggle Grandmasters tackle challenges, and apply new skills to your own projects.
What this Book will help me do
Master the methodology used in past Kaggle competitions for real-world applications.
Discover and implement advanced data science techniques such as gradient boosting and NLP.
Build a portfolio that demonstrates hands-on experience solving complex data problems.
Learn time-series forecasting and computer vision by exploring detailed case studies.
Develop a practical mindset for competitive data science problem solving.
Author(s)
Konrad Banachewicz and Luca Massaron bring their expertise as Kaggle Grandmasters to the pages of this book. With extensive experience in data science and collaborative problem-solving, they guide readers through practical exercises with a clear, approachable style. Their passion for sharing knowledge shines through in every chapter.
Who is it for?
"The Kaggle Workbook" is ideal for aspiring and experienced data scientists who want to sharpen their competitive data science skills. It caters to those with a foundational knowledge of data science and an interest in enhancing it through practical exercises. The book is a perfect fit for anyone aiming to succeed in Kaggle competitions, whether starting out or advancing further.

Experimentation for Engineers

Optimize the performance of your systems with practical experiments used by engineers in the world’s most competitive industries.
In Experimentation for Engineers: From A/B testing to Bayesian optimization you will learn how to:
Design, run, and analyze an A/B test
Break the "feedback loops" caused by periodic retraining of ML models
Increase experimentation rate with multi-armed bandits
Tune multiple parameters experimentally with Bayesian optimization
Clearly define business metrics used for decision-making
Identify and avoid the common pitfalls of experimentation
Experimentation for Engineers: From A/B testing to Bayesian optimization is a toolbox of techniques for evaluating new features and fine-tuning parameters. You’ll start with a deep dive into methods like A/B testing, and then graduate to advanced techniques used to measure performance in industries such as finance and social media. Learn how to evaluate the changes you make to your system and ensure that your testing doesn’t undermine revenue or other business metrics. By the time you’re done, you’ll be able to seamlessly deploy experiments in production while avoiding common pitfalls.
About the Technology
Does my software really work? Did my changes make things better or worse? Should I trade features for performance? Experimentation is the only way to answer questions like these. This unique book reveals sophisticated experimentation practices developed and proven in the world’s most competitive industries that will help you enhance machine learning systems, software applications, and quantitative trading solutions.
About the Book
Experimentation for Engineers: From A/B testing to Bayesian optimization delivers a toolbox of processes for optimizing software systems. You’ll start by learning the limits of A/B testing, and then graduate to advanced experimentation strategies that take advantage of machine learning and probabilistic methods. The skills you’ll master in this practical guide will help you minimize the costs of experimentation and quickly reveal which approaches and features deliver the best business results.
What's Inside
Design, run, and analyze an A/B test
Break the “feedback loops” caused by periodic retraining of ML models
Increase experimentation rate with multi-armed bandits
Tune multiple parameters experimentally with Bayesian optimization
About the Reader
For ML and software engineers looking to extract the most value from their systems. Examples in Python and NumPy.
About the Author
David Sweet has worked as a quantitative trader at GETCO and a machine learning engineer at Instagram. He teaches in the AI and Data Science master's programs at Yeshiva University.
Quotes
Putting an ‘improved’ version of a system into production can be really risky. This book focuses you on what is important! - Simone Sguazza, University of Applied Sciences and Arts of Southern Switzerland
A must-have for anyone setting up experiments, from A/B tests to contextual bandits and Bayesian optimization. - Maxim Volgin, KLM
Shows a non-mathematical programmer exactly what they need to write powerful mathematically-based testing algorithms. - Patrick Goetz, The University of Texas at Austin
Gives you the tools you need to get the most out of your experiments. - Marc-Anthony Taylor, Raiffeisen Bank International