talk-data.com

Topic

Docker

containerization devops virtualization

Activity Trend

14 peak/qtr, 2020-Q1 to 2026-Q1

Top Events

O'Reilly Data Engineering Books · 25
Data Engineering Podcast · 15
Google Cloud Next '25 · 5
O'Reilly Data Science Books · 4
DataTalks.Club · 4
Live Workshop: Automate Image Builds using Pulumi and Docker Build Cloud · 3
Microsoft Ignite 2025 · 3
Airflow Summit 2023 · 3
Airflow Summit 2020 · 3
The Pragmatic Engineer · 2
Build & Learn: Data Science with Coffee · 2
Docker Fundamentals & Optimisations Workshop (FLINTA only event) · 2

Top Speakers

Tobias Macey · 15
Lindsey · 6
Diana Esteves (Pulumi) · 5
Michael Irwin (Docker) · 5
Michael YenChi Ho (Microsoft) · 3
Brian Redmond (Microsoft) · 3
Ramcharan Kakarla · 2
Jarek Potiuk (Apache Software Foundation) · 2
Eben Hewitt · 2
Kaxil Naik · 2
Marinka (Spiced Academy) · 2
Gergely Orosz · 2

Activities

Filtering by: O'Reilly Data Science Books

Getting Started with Taipy

2025-10-06 · O'Reilly Data Science Books

book

by Eric Narro

AI/ML BI Cloud Computing KPI Pandas Python data data-science data-science-tools

Share your machine learning models, create chatbots, as well as build and deploy insightful dashboards speedily using Taipy with this hands-on book featuring real-world application examples from multiple industries.

Free with your book: DRM-free PDF version + access to Packt's next-gen Reader (email sign-up and proof of purchase required)

Key Features

Create visually compelling, interactive data applications with Taipy
Bring predictive models to end users and create data pipelines to compare scenarios with what-if analyses
Go beyond prototypes to build and deploy production-ready applications using the cloud provider of your choice
Purchase of the print or Kindle book includes a free PDF eBook in full color

Book Description

While data analysts, data scientists, and BI experts have the tools to analyze data, build models, and create compelling visuals, they often struggle to translate these insights into practical, user-friendly applications that help end users answer real-world questions, such as identifying revenue trends, predicting inventory needs, or detecting fraud, without wading through complex code. This book is a comprehensive guide to overcoming this challenge.

This book teaches you how to use Taipy, a powerful open-source Python library, to build intuitive, production-ready data apps quickly and efficiently. Instead of creating prototypes that nobody uses, you'll learn how to build faster applications that process large amounts of data for multiple users and deliver measurable business impact. Taipy does the heavy lifting to enable your users to visualize their KPIs, interact with charts and maps, and compare scenarios for better decision-making. You’ll learn to use Taipy to build apps that make your data accessible and actionable in production environments like the cloud or Docker.

By the end of this book, you won’t just understand Taipy, you'll be able to transform your data skills into impactful solutions that address real-world needs and deliver valuable insights.

What you will learn

Explore Taipy, its use cases, and how it's different from other projects
Discover how to create visually appealing interactive apps, display KPIs, charts, and maps
Understand how to compare scenarios to make better decisions
Connect Taipy applications to several data sources and services
Develop apps for diverse use cases, including chatbots, dashboards, ML apps, and maps
Deploy Taipy applications on different types of servers and services
Master advanced concepts for simplifying and accelerating your development workflow

Who this book is for

If you’re a data analyst, data scientist, or BI analyst looking to build production-ready data apps entirely in Python, this book is for you. If your scripts and models sit idle because non-technical stakeholders can’t use them, this book shows you how to turn them into full applications fast with Taipy, so your work delivers real business value. It’s also valuable for developers and engineers who want to streamline their data workflows and build UIs in pure Python.

Effective Data Science Infrastructure

2022-08-09 · O'Reilly Data Science Books

book

by Ville Tuulos

AI/ML Analytics AWS Cloud Computing Data Science MLOps NumPy Python data data-science

Simplify data science infrastructure to give data scientists an efficient path from prototype to production.

In Effective Data Science Infrastructure you will learn how to:

Design data science infrastructure that boosts productivity
Handle compute and orchestration in the cloud
Deploy machine learning to production
Monitor and manage performance and results
Combine cloud-based tools into a cohesive data science environment
Develop reproducible data science projects using Metaflow, Conda, and Docker
Architect complex applications for multiple teams and large datasets
Customize and grow data science infrastructure

Effective Data Science Infrastructure: How to make data scientists more productive is a hands-on guide to assembling infrastructure for data science and machine learning applications. It reveals the processes used at Netflix and other data-driven companies to manage their cutting edge data infrastructure. In it, you’ll master scalable techniques for data storage, computation, experiment tracking, and orchestration that are relevant to companies of all shapes and sizes. You’ll learn how you can make data scientists more productive with your existing cloud infrastructure, a stack of open source software, and idiomatic Python.

The author is donating proceeds from this book to charities that support women and underrepresented groups in data science.

About the Technology

Growing data science projects from prototype to production requires reliable infrastructure. Using the powerful new techniques and tooling in this book, you can stand up an infrastructure stack that will scale with any organization, from startups to the largest enterprises.

About the Book

Effective Data Science Infrastructure teaches you to build data pipelines and project workflows that will supercharge data scientists and their projects. Based on state-of-the-art tools and concepts that power data operations of Netflix, this book introduces a customizable cloud-based approach to model development and MLOps that you can easily adapt to your company’s specific needs. As you roll out these practical processes, your teams will produce better and faster results when applying data science and machine learning to a wide array of business problems.

What's Inside

Handle compute and orchestration in the cloud
Combine cloud-based tools into a cohesive data science environment
Develop reproducible data science projects using Metaflow, AWS, and the Python data ecosystem
Architect complex applications that require large datasets and models, and a team of data scientists

About the Reader

For infrastructure engineers and engineering-minded data scientists who are familiar with Python.

About the Author

At Netflix, Ville Tuulos designed and built Metaflow, a full-stack framework for data science. Currently, he is the CEO of a startup focusing on data science infrastructure.

Quotes

By reading and referring to this book, I’m confident you will learn how to make your machine learning operations much more efficient and productive. - From the Foreword by Travis Oliphant, Author of NumPy, Founder of Anaconda, PyData, and NumFOCUS
Effective Data Science Infrastructure is a brilliant book. It’s a must-have for every data science team. - Ninoslav Cerkez, Logit
More data science. Less headaches. - Dr. Abel Alejandro Coronado Iruegas, National Institute of Statistics and Geography of Mexico
Indispensable. A copy should be on every data engineer’s bookshelf. - Matthew Copple, Grand River Analytics

Data Science at the Command Line, 2nd Edition

2021-08-17 · O'Reilly Data Science Books

book

by Jeroen Janssens

Agile/Scrum API CSV Data Science HTML JSON Linux Python Spark Unix XML data +1 more

This thoroughly revised guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You'll learn how to combine small yet powerful command-line tools to quickly obtain, scrub, explore, and model your data. To get you started, author Jeroen Janssens provides a Docker image packed with over 100 Unix power tools, useful whether you work with Windows, macOS, or Linux.

You'll quickly discover why the command line is an agile, scalable, and extensible technology. Even if you're comfortable processing data with Python or R, you'll learn how to greatly improve your data science workflow by leveraging the command line's power. This book is ideal for data scientists, analysts, engineers, system administrators, and researchers.

Obtain data from websites, APIs, databases, and spreadsheets
Perform scrub operations on text, CSV, HTML, XML, and JSON files
Explore data, compute descriptive statistics, and create visualizations
Manage your data science workflow
Create your own tools from one-liners and existing Python or R code
Parallelize and distribute data-intensive pipelines
Model data with dimensionality reduction, regression, and classification algorithms
Leverage the command line from Python, Jupyter, R, RStudio, and Apache Spark

Data Science with Python and Dask

2019-07-18 · O'Reilly Data Science Books

book

by Jesse Daniel

AI/ML Analytics AWS Cloud Computing Data Science NumPy Pandas PySpark Python Scikit-learn Seaborn dask +3 more

Dask is a native parallel analytics tool designed to integrate seamlessly with the libraries you’re already using, including Pandas, NumPy, and Scikit-Learn. With Dask you can crunch and work with huge datasets, using the tools you already have. And Data Science with Python and Dask is your guide to using Dask for your data projects without changing the way you work!

About the Technology

An efficient data pipeline means everything for the success of a data science project. Dask is a flexible library for parallel computing in Python that makes it easy to build intuitive workflows for ingesting and analyzing large, distributed datasets. Dask provides dynamic task scheduling and parallel collections that extend the functionality of NumPy, Pandas, and Scikit-learn, enabling users to scale their code from a single laptop to a cluster of hundreds of machines with ease.

About the Book

Data Science with Python and Dask teaches you to build scalable projects that can handle massive datasets. After meeting the Dask framework, you’ll analyze data in the NYC Parking Ticket database and use DataFrames to streamline your process. Then, you’ll create machine learning models using Dask-ML, build interactive visualizations, and build clusters using AWS and Docker.

What's Inside

Working with large, structured and unstructured datasets
Visualization with Seaborn and Datashader
Implementing your own algorithms
Building distributed apps with Dask Distributed
Packaging and deploying Dask apps

About the Reader

For data scientists and developers with experience using Python and the PyData stack.

About the Author

Jesse Daniel is an experienced Python developer. He taught Python for Data Science at the University of Denver and leads a team of data scientists at a Denver-based media technology company. We interviewed Jesse as a part of our Six Questions series.

Quotes

The most comprehensive coverage of Dask to date, with real-world examples that made a difference in my daily work. - Al Krinker, United States Patent and Trademark Office
An excellent alternative to PySpark for those who are not on a cloud platform. The author introduces Dask in a way that speaks directly to an analyst. - Jeremy Loscheider, Panera Bread
A greatly paced introduction to Dask with real-world datasets. - George Thomas, R&D Architecture Manhattan Associates
The ultimate resource to quickly get up and running with Dask and parallel processing in Python. - Gustavo Patino, Oakland University William Beaumont School of Medicine