talk-data.com talk-data.com

Topic

PyTorch

deep_learning machine_learning neural_networks

80

tagged

Activity Trend

16 peak/qtr
2020-Q1 2026-Q1

Activities

80 activities · Newest first

High Performance Spark, 2nd Edition

Apache Spark is amazing when everything clicks. But if you haven't seen the performance improvements you expected or still don't feel confident enough to use Spark in production, this practical book is for you. Authors Holden Karau, Rachel Warren, and Anya Bida walk you through the secrets of the Spark code base, and demonstrate performance optimizations that will help your data pipelines run faster, scale to larger datasets, and avoid costly antipatterns. Ideal for data engineers, software engineers, data scientists, and system administrators, the second edition of High Performance Spark presents new use cases, code examples, and best practices for Spark 3.x and beyond. This book gives you a fresh perspective on this continually evolving framework and shows you how to work around bumps on your Spark and PySpark journey. With this book, you'll learn how to: Accelerate your ML workflows with integrations including PyTorch Handle key skew and take advantage of Spark's new dynamic partitioning Make your code reliable with scalable testing and validation techniques Make Spark high performance Deploy Spark on Kubernetes and similar environments Take advantage of GPU acceleration with RAPIDS and resource profiles Get your Spark jobs to run faster Use Spark to productionize exploratory data science projects Handle even larger datasets with Spark Gain faster insights by reducing pipeline running times

Time Series Analysis with Python Cookbook - Second Edition

Perform time series analysis and forecasting confidently with this Python code bank and reference manual Purchase of the print or Kindle book includes a free PDF eBook Key Features Explore up-to-date forecasting and anomaly detection techniques using statistical, machine learning, and deep learning algorithms Learn different techniques for evaluating, diagnosing, and optimizing your models Work with a variety of complex data with trends, multiple seasonal patterns, and irregularities Book Description To use time series data to your advantage, you need to be well-versed in data preparation, analysis, and forecasting. This fully updated second edition includes chapters on probabilistic models and signal processing techniques, as well as new content on transformers. Additionally, you will leverage popular libraries and their latest releases covering Pandas, Polars, Sktime, stats models, stats forecast, Darts, and Prophet for time series with new and relevant examples. You'll start by ingesting time series data from various sources and formats, and learn strategies for handling missing data, dealing with time zones and custom business days, and detecting anomalies using intuitive statistical methods. Further, you'll explore forecasting using classical statistical models (Holt-Winters, SARIMA, and VAR). Learn practical techniques for handling non-stationary data, using power transforms, ACF and PACF plots, and decomposing time series data with multiple seasonal patterns. Then we will move into more advanced topics such as building ML and DL models using TensorFlow and PyTorch, and explore probabilistic modeling techniques. In this part, you’ll also learn how to evaluate, compare, and optimize models, making sure that you finish this book well-versed in wrangling data with Python. What you will learn Understand what makes time series data different from other data Apply imputation and interpolation strategies to handle missing data Implement an array of models for univariate and multivariate time series Plot interactive time series visualizations using hvPlot Explore state-space models and the unobserved components model (UCM) Detect anomalies using statistical and machine learning methods Forecast complex time series with multiple seasonal patterns Use conformal prediction for constructing prediction intervals for time series Who this book is for This book is for data analysts, business analysts, data scientists, data engineers, and Python developers who want practical Python recipes for time series analysis and forecasting techniques. Fundamental knowledge of Python programming is a prerequisite. Prior experience working with time series data to solve business problems will also help you to better utilize and apply the different recipes in this book.

AI Systems Performance Engineering

Elevate your AI system performance capabilities with this definitive guide to maximizing efficiency across every layer of your AI infrastructure. In today's era of ever-growing generative models, AI Systems Performance Engineering provides engineers, researchers, and developers with a hands-on set of actionable optimization strategies. Learn to co-optimize hardware, software, and algorithms to build resilient, scalable, and cost-effective AI systems that excel in both training and inference. Authored by Chris Fregly, a performance-focused engineering and product leader, this resource transforms complex AI systems into streamlined, high-impact AI solutions. Inside, you'll discover step-by-step methodologies for fine-tuning GPU CUDA kernels, PyTorch-based algorithms, and multinode training and inference systems. You'll also master the art of scaling GPU clusters for high performance, distributed model training jobs, and inference servers. The book ends with a 175+-item checklist of proven, ready-to-use optimizations. Codesign and optimize hardware, software, and algorithms to achieve maximum throughput and cost savings Implement cutting-edge inference strategies that reduce latency and boost throughput in real-world settings Utilize industry-leading scalability tools and frameworks Profile, diagnose, and eliminate performance bottlenecks across complex AI pipelines Integrate full stack optimization techniques for robust, reliable AI system performance

AI/ML workloads depend heavily on complex software stacks, including numerical computing libraries (SciPy, NumPy), deep learning frameworks (PyTorch, TensorFlow), and specialized toolchains (CUDA, cuDNN). However, integrating these dependencies into Bazel-based workflows remains challenging due to compatibility issues, dependency resolution, and performance optimization. This session explores the process of creating and maintaining Bazel packages for key AI/ML libraries, ensuring reproducibility, performance, and ease of use for researchers and engineers.

Hands-On Machine Learning with Scikit-Learn and PyTorch

The potential of machine learning today is extraordinary, yet many aspiring developers and tech professionals find themselves daunted by its complexity. Whether you're looking to enhance your skill set and apply machine learning to real-world projects or are simply curious about how AI systems function, this book is your jumping-off place. With an approachable yet deeply informative style, author Aurélien Géron delivers the ultimate introductory guide to machine learning and deep learning. Drawing on the Hugging Face ecosystem, with a focus on clear explanations and real-world examples, the book takes you through cutting-edge tools like Scikit-Learn and PyTorch—from basic regression techniques to advanced neural networks. Whether you're a student, professional, or hobbyist, you'll gain the skills to build intelligent systems. Understand ML basics, including concepts like overfitting and hyperparameter tuning Complete an end-to-end ML project using scikit-Learn, covering everything from data exploration to model evaluation Learn techniques for unsupervised learning, such as clustering and anomaly detection Build advanced architectures like transformers and diffusion models with PyTorch Harness the power of pretrained models—including LLMs—and learn to fine-tune them Train autonomous agents using reinforcement learning

torchFastText: Modernizing Text Classification at Insee with PyTorch-based models

Discover how Insee transitioned from fastText to a PyTorch-based model for text classification by developing and open-sourcing the torchFastText package. This presentation will cover the creation, deployment, and practical applications of torchFastText in modernizing automatic coding systems, benefiting Insee and other European National Statistical Institutes (NSIs).

Most common machine learning models (linear, tree-based or neural network-based), optimize for the least squares loss when trained for regression tasks. As a result, they output a point estimate of the conditional expected value of the target: E[y|X].

In this presentation, we will explore several ways to train and evaluate probabilistic regression models as a richer alternative to point estimates. Those models predict a richer description of the full distribution of y|X and allow us to quantify the predictive uncertainty for individual predictions.

On the model training part, we will introduce the following options:

  • ensemble of quantile regressors for a grid of quantile levels (using linear models or gradient boosted trees in scikit-learn, XGBoost and PyTorch),
  • how to reduce probabilistic regression to multi-class classification + a cumulative sum of the predict_proba output to recover a continuous conditional CDF.
  • how to implement this approach as a generic scikit-learn meta-estimator;
  • how this approach is used to pretrain foundational tabular models (e.g. TabPFNv2).
  • simple Bayesian models (e.g. Bayesian Ridge and Gaussian Processes);
  • more specialized approaches as implemented in XGBoostLSS.

We will also discuss how to evaluate probabilistic predictions via:

  • the pinball loss of quantile regressors,
  • other strictly proper scoring rules such as Continuous Ranked Probability Score (CRPS),
  • coverage measures and width of prediction intervals,
  • reliability diagrams for different quantile levels.

We will illustrate of those concepts with concrete examples and running code.

Finally, we will illustrate why some applications need such calibrated probabilistic predictions:

  • estimating uncertainty in trip times depending on traffic conditions to help a human decision make choose among various travel plan options.
  • modeling value at risk for investment decisions,
  • assessing the impact of missing variables for an ML model trained to work in degraded mode,
  • Bayesian optimization for operational parameters of industrial machines from little/costly observations.

If time allows, will also discuss usage and limitations of Conformal Quantile Regressors as implemented in MAPIE and contrast aleatoric vs epistemic uncertainty captured by those models.

Tackling Domain Shift with SKADA: A Hands-On Guide to Domain Adaptation

Domain adaptation addresses the challenge of applying ML models to data that differs from the training distribution—a common issue in real-world applications. SKADA is a new Python library that brings domain adaptation tools to the sci-kit-learn and PyTorch ecosystem. This talk covers SKADA’s design, its integration with standard ML workflows, and how it helps practitioners build models that generalize better across domains.

A Hitchhiker's Guide to the Array API Standard Ecosystem

The array API standard is unifying the ecosystem of Python array computing, facilitating greater interoperability between code written for different array libraries, including NumPy, CuPy, PyTorch, JAX, and Dask.

But what are all of these "array-api-" libraries for? How can you use these libraries to 'future-proof' your libraries, and provide support for GPU and distributed arrays to your users? Find out in this talk, where I'll guide you through every corner of the array API standard ecosystem, explaining how SciPy and scikit-learn are using all of these tools to adopt the standard. I'll also be sharing progress updates from the past year, to give you a clear picture of where we are now, and what the future holds.

Deep Learning with Python, Third Edition

The bestselling book on Python deep learning, now covering generative AI, Keras 3, PyTorch, and JAX! Deep Learning with Python, Third Edition puts the power of deep learning in your hands. This new edition includes the latest Keras and TensorFlow features, generative AI models, and added coverage of PyTorch and JAX. Learn directly from the creator of Keras and step confidently into the world of deep learning with Python. In Deep Learning with Python, Third Edition you’ll discover: Deep learning from first principles The latest features of Keras 3 A primer on JAX, PyTorch, and TensorFlow Image classification and image segmentation Time series forecasting Large Language models Text classification and machine translation Text and image generation—build your own GPT and diffusion models! Scaling and tuning models With over 100,000 copies sold, Deep Learning with Python makes it possible for developers, data scientists, and machine learning enthusiasts to put deep learning into action. In this expanded and updated third edition, Keras creator François Chollet offers insights for both novice and experienced machine learning practitioners. You'll master state-of-the-art deep learning tools and techniques, from the latest features of Keras 3 to building AI models that can generate text and images. About the Technology In less than a decade, deep learning has changed the world—twice. First, Python-based libraries like Keras, TensorFlow, and PyTorch elevated neural networks from lab experiments to high-performance production systems deployed at scale. And now, through Large Language Models and other generative AI tools, deep learning is again transforming business and society. In this new edition, Keras creator François Chollet invites you into this amazing subject in the fluid, mentoring style of a true insider. About the Book Deep Learning with Python, Third Edition makes the concepts behind deep learning and generative AI understandable and approachable. This complete rewrite of the bestselling original includes fresh chapters on transformers, building your own GPT-like LLM, and generating images with diffusion models. Each chapter introduces practical projects and code examples that build your understanding of deep learning, layer by layer. What's Inside Hands-on, code-first learning Comprehensive, from basics to generative AI Intuitive and easy math explanations Examples in Keras, PyTorch, JAX, and TensorFlow About the Reader For readers with intermediate Python skills. No previous experience with machine learning or linear algebra required. About the Authors François Chollet is the co-founder of Ndea and the creator of Keras. Matthew Watson is a software engineer at Google working on Gemini and a core maintainer of Keras. Quotes Perfect for anyone interested in learning by doing from one of the industry greats. - Anthony Goldbloom, Founder of Kaggle A sharp, deeply practical guide that teaches you how to think from first principles to build models that actually work. - Santiago Valdarrama, Founder of ml.school The most up-to-date and complete guide to deep learning you’ll find today! - Aran Komatsuzaki, EleutherAI Masterfully conveys the true essence of neural networks. A rare case in recent years of outstanding technical writing. - Salvatore Sanfilippo, Creator of Redis