talk-data.com

Topic

Seaborn

data_visualization statistical_graphics python

14 tagged

Activity Trend: 2020-Q1 to 2026-Q1, peak of 1 per quarter

Activities

Filtered by: O'Reilly Data Science Books

Data Without Labels

Discover all-practical implementations of the key algorithms and models for handling unlabeled data. The book is full of case studies demonstrating how to apply each technique to real-world problems.
In Data Without Labels you’ll learn:
Fundamental building blocks and concepts of machine learning and unsupervised learning
Data cleaning for structured and unstructured data like text and images
Clustering algorithms like K-means, hierarchical clustering, DBSCAN, Gaussian Mixture Models, and spectral clustering
Dimensionality reduction methods like Principal Component Analysis (PCA), SVD, multidimensional scaling, and t-SNE
Association rule algorithms like Apriori, ECLAT, and SPADE
Unsupervised time series clustering, Gaussian Mixture Models, and statistical methods
Building neural networks such as GANs and autoencoders
Working with Python tools and libraries like scikit-learn, NumPy, Pandas, matplotlib, Seaborn, Keras, TensorFlow, and Flask
How to interpret the results of unsupervised learning
Choosing the right algorithm for your problem
Deploying unsupervised learning to production
Maintenance and refresh of an ML solution
Data Without Labels introduces mathematical techniques, key algorithms, and Python implementations that will help you build machine learning models for unannotated data. You’ll discover hands-off and unsupervised machine learning approaches that can still untangle raw, real-world datasets and support sound strategic decisions for your business. Don’t get bogged down in theory: the book bridges the gap between complex math and practical Python implementations, covering end-to-end model development all the way through to production deployment. You’ll discover the business use cases for machine learning and unsupervised learning, and access insightful research papers to complete your knowledge.
About the Technology: Generative AI, predictive algorithms, fraud detection, and many other analysis tasks rely on cheap and plentiful unlabeled data. Machine learning on data without labels, or unsupervised learning, turns raw text, images, and numbers into insights about your customers, accurate computer vision, and high-quality datasets for training AI models. This book will show you how.
About the Book: Data Without Labels is a comprehensive guide to unsupervised learning, offering a deep dive into its mathematical foundations, algorithms, and practical applications. It presents practical examples from retail, aviation, and banking using fully annotated Python code. You’ll explore core techniques like clustering and dimensionality reduction along with advanced topics like autoencoders and GANs. As you go, you’ll learn where to apply unsupervised learning in business applications and discover how to develop your own machine learning models end-to-end.
What's Inside:
Master unsupervised learning algorithms
Real-world business applications
Curate AI training datasets
Explore autoencoders and GANs applications
About the Reader: Intended for data science professionals. Assumes knowledge of Python and basic machine learning.
About the Author: Vaibhav Verdhan is a seasoned data science professional with extensive experience working on data science projects in a large pharmaceutical company.
Quotes:
An invaluable resource for anyone navigating the complexities of unsupervised learning. A must-have. - Ganna Pogrebna, The Alan Turing Institute
Empowers the reader to unlock the hidden potential within their data. - Sonny Shergill, AstraZeneca
A must-have for teams working with unstructured data. Cuts through the fog of theory and delivers practical solutions. - Leonardo Gomes da Silva, onGRID Sports Technology
The Bible for unsupervised learning! Full of real-world applications, clear explanations, and excellent Python implementations. - Gary Bake, Falconhurst Technologies

Pandas for Everyone: Python Data Analysis, 2nd Edition

Manage and Automate Data Analysis with Pandas in Python
Today, analysts must manage data characterized by extraordinary variety, velocity, and volume. Using the open source Pandas library, you can use Python to rapidly automate and perform virtually any data analysis task, no matter how large or complex. Pandas can help you ensure the veracity of your data, visualize it for effective decision-making, and reliably reproduce analyses across multiple data sets. Pandas for Everyone, 2nd Edition, brings together practical knowledge and insight for solving real problems with Pandas, even if you're new to Python data analysis. Daniel Y. Chen introduces key concepts through simple but practical examples, incrementally building on them to solve more difficult, real-world data science problems, such as using regularization to prevent overfitting or deciding when to use unsupervised machine learning methods to find the underlying structure in a data set.
New features in the second edition include:
Extended coverage of plotting and the seaborn data visualization library
Expanded examples and resources
Updated Python 3.9 code and packages coverage, including statsmodels and scikit-learn libraries
Online bonus material on geopandas, Dask, and creating interactive graphics with Altair
Chen gives you a jumpstart on using Pandas with a realistic data set and covers combining data sets, handling missing data, and structuring data sets for easier analysis and visualization. He demonstrates powerful data cleaning techniques, from basic string manipulation to applying functions simultaneously across dataframes. Once your data is ready, Chen guides you through fitting models for prediction, clustering, inference, and exploration. He provides tips on performance and scalability and introduces you to the wider Python data analysis ecosystem.
Work with DataFrames and Series, and import or export data
Create plots with matplotlib, seaborn, and pandas
Combine data sets and handle missing data
Reshape, tidy, and clean data sets so they're easier to work with
Convert data types and manipulate text strings
Apply functions to scale data manipulations
Aggregate, transform, and filter large data sets with groupby
Leverage Pandas' advanced date and time capabilities
Fit linear models using statsmodels and scikit-learn libraries
Use generalized linear modeling to fit models with different response variables
Compare multiple models to select the best one
Regularize to overcome overfitting and improve performance
Use clustering in unsupervised machine learning
...

The Art of Data-Driven Business

Learn how to integrate data-driven methodologies and machine learning into your business decision-making processes with 'The Art of Data-Driven Business.' This comprehensive guide shows you how to apply Python-based machine learning techniques to real-world challenges, transforming your organization into an innovative and well-informed enterprise.
What this Book will help me do: Create professional-quality data visualizations using Python's seaborn library to derive business insights. Analyze customer behavior, including predicting churn, with machine learning techniques. Apply clustering algorithms to segment customers for targeted marketing campaigns. Utilize pandas effectively for pricing and sales analytics to optimize your pricing strategies. Forecast outcomes of promotional strategies to determine costs and benefits and maximize performance.
Author(s): Palacio is an experienced data scientist and educator who specializes in the application of machine learning to solve business problems. With extensive real-world industry experience, Palacio brings practical insights and methodologies to learners. Their teaching connects technical knowledge to actionable business strategies.
Who is it for? This book is ideal for business professionals aiming to incorporate data science into their strategies and technical experts seeking to leverage machine learning for business scenarios. Beginners to Python can find foundational help, while data scientists will appreciate the focused practical applications. It's perfect for individuals seeking a strong data-driven perspective in marketing, sales, and customer management.

Hands-on Matplotlib: Learn Plotting and Visualizations with Python 3

Learn the core aspects of NumPy, Matplotlib, and Pandas, and use them to write programs with Python 3. This book focuses heavily on various data visualization techniques and will help you acquire expert-level knowledge of working with Matplotlib, a MATLAB-style plotting library for the Python programming language that provides an object-oriented API for embedding plots into applications.
You'll begin with an introduction to Python 3 and the scientific Python ecosystem. Next, you'll explore NumPy and ndarray data structures, creation routines, and data visualization. You'll examine useful concepts related to style sheets, legends, and layouts, followed by line, bar, and scatter plots. Chapters then cover recipes of histograms, contours, streamplots, and heatmaps, and how to visualize images and audio with pie and polar charts.
Moving forward, you'll learn how to visualize with pcolor, pcolormesh, and colorbar, and how to visualize in 3D in Matplotlib, create simple animations, and embed Matplotlib with different frameworks. The concluding chapters cover how to visualize data with Pandas, Matplotlib, and Seaborn, and how to work with real-life data and visualize it. After reading Hands-on Matplotlib you'll be proficient with Matplotlib and able to comfortably work with ndarrays in NumPy and data frames in Pandas.
What You'll Learn:
Understand Data Visualization and Python using Matplotlib
Review the fundamental data structures in NumPy and Pandas
Work with 3D plotting, visualizations, and animations
Visualize images and audio data
Who This Book Is For: Data scientists, machine learning engineers and software professionals with basic programming skills.

Hands-On Data Analysis with Pandas - Second Edition

'Hands-On Data Analysis with Pandas' guides you to gain expertise in the Python pandas library for data analysis and manipulation. With practical, real-world examples, you'll learn to analyze datasets, visualize data trends, and implement machine learning models for actionable insights.
What this Book will help me do: Understand and implement data analysis techniques with Python. Develop expertise in data manipulation using pandas and NumPy. Visualize data effectively with pandas visualization tools and seaborn. Apply machine learning techniques with Python libraries. Combine datasets and handle complex data workflows efficiently.
Author(s): Stefanie Molin is a software engineer and data scientist with extensive experience in analytics and Python. She has worked with large data-driven systems and has a strong focus on teaching data analysis effectively. Stefanie's books are known for their practical, hands-on approach to solving real data problems.
Who is it for? This book is perfect for aspiring data scientists, data analysts, and Python developers. Readers with beginner to intermediate skill levels in Python will find it accessible and informative. It is designed for those seeking to build practical data analysis skills. If you're looking to add data science and pandas to your toolkit, this book is ideal.

The Data Analysis Workshop

The Data Analysis Workshop teaches you how to analyze and interpret data to solve real-world business problems effectively. By working through practical examples and datasets, you'll gain actionable insights into modern analytic techniques and build your confidence as a data analyst.
What this Book will help me do: Understand and apply fundamental data analysis concepts and techniques to tackle diverse datasets. Perform rigorous hypothesis testing and analyze group differences within data sets. Create informative data visualizations using Python libraries like Matplotlib and Seaborn. Understand and use correlation metrics to identify relationships between variables. Leverage advanced data manipulation techniques to uncover hidden patterns in complex datasets.
Author(s): The authors, Gururajan Govindan, Shubhangi Hora, and Konstantin Palagachev, are experts in data science and analytics with years of experience in industry and academia. Their background includes performing business-critical analysis for companies and teaching students how to approach data-driven decision-making. They bring their depth of knowledge and engaging teaching styles together in this approachable guide.
Who is it for? This book is intended for programmers with proficiency in Python who want to apply their skills to the field of data analysis. Readers who have a foundational understanding of coding and are eager to implement hands-on data science techniques will gain the most value. The content is also suitable for anyone pursuing a data-driven problem-solving mindset. This is an excellent resource to help transition from basic coding proficiency to applying Python in real-world data science.

The Applied Data Science Workshop - Second Edition

Embark on an interactive journey into the world of data science with 'The Applied Data Science Workshop'. By following real-world scenarios and hands-on exercises, you will explore the fundamentals of data analysis and machine learning modeling within Jupyter Notebooks, leveraging Python libraries like pandas and scikit-learn to draw meaningful insights from data.
What this Book will help me do: Master the process of setting up and using Jupyter Notebooks effectively for data science tasks. Learn to preprocess, analyze, and visualize data using Python libraries such as pandas, Matplotlib, and Seaborn. Discover methods to train and evaluate machine learning models using real-world data scenarios. Apply techniques to assess model performance and optimize models with advanced validation. Gain the skills to communicate insights through well-documented analyses and stakeholder-ready reports.
Author(s): Galea, an accomplished author in the data science domain, focuses on making technical concepts understandable and relatable. With this book, Galea leverages years of experience to introduce readers to practical applications of data science using Python. The author's approach ensures that readers not only learn the concepts but also apply them hands-on.
Who is it for? This book caters to aspiring data scientists and developers interested in data analysis and practical applications of data science techniques. Beginners will find the step-by-step methodology approachable, while those with a basic understanding of Python programming or machine learning can quickly extend their skills. It suits anyone eager to add data science to their professional toolbox.

Pandas 1.x Cookbook - Second Edition

The 'Pandas 1.x Cookbook' offers a recipe-based guide for mastering the powerful Python library, pandas. You will gain practical knowledge for handling and manipulating data efficiently, from the fundamentals to advanced techniques. The book is an essential resource for exploring and analyzing datasets with pandas.
What this Book will help me do: Understand and apply data exploration techniques in pandas. Use pandas to manipulate, aggregate, and clean datasets to extract meaningful insights. Combine pandas with Matplotlib and Seaborn to create effective visualizations. Perform time series analysis and transform datasets for machine learning. Implement workflows for handling large-scale data that exceeds your computer's memory.
Author(s): Matthew Harrison and Theodore Petrou are highly experienced educators and practitioners in data science and Python programming. With their extensive expertise in using pandas, they provide insights through practical exercises and approachable narratives. Their aim is to make complex concepts accessible to learners of varying skill levels.
Who is it for? This book is ideal for Python programmers, analysts, and data scientists seeking to expand their data handling and analysis capabilities. It caters to both beginners who are new to pandas and those looking to deepen their understanding of its advanced features. If your goal is to explore, clean, and analyze complex datasets efficiently, this book is tailored for you.

Hands-On Data Analysis with Pandas

Hands-On Data Analysis with Pandas provides an intensive dive into mastering the pandas library for data science and analysis using Python. Through a combination of conceptual explanations and practical demonstrations, readers will learn how to manipulate, visualize, and analyze data efficiently.
What this Book will help me do: Understand and apply the pandas library for efficient data manipulation. Learn to perform data wrangling tasks such as cleaning and reshaping datasets. Create effective visualizations using pandas and libraries like matplotlib and seaborn. Grasp the basics of machine learning and implement solutions with scikit-learn. Develop reusable data analysis scripts and modules in Python.
Author(s): Stefanie Molin is a seasoned data scientist and software engineer with extensive experience in Python and data analytics. She specializes in leveraging the latest data science techniques to solve real-world problems. Her engaging and detailed writing draws from her practical expertise, aiming to make complex concepts accessible to all.
Who is it for? This book is ideal for data analysts and aspiring data scientists who are at the beginning stages of their careers or looking to enhance their toolset with pandas and Python. It caters to Python developers eager to delve into data analysis workflows. Readers should have some programming knowledge to fully benefit from the examples and exercises.

Data Science with Python and Dask

Dask is a native parallel analytics tool designed to integrate seamlessly with the libraries you’re already using, including Pandas, NumPy, and Scikit-Learn. With Dask you can crunch and work with huge datasets, using the tools you already have. And Data Science with Python and Dask is your guide to using Dask for your data projects without changing the way you work!
About the Technology: An efficient data pipeline means everything for the success of a data science project. Dask is a flexible library for parallel computing in Python that makes it easy to build intuitive workflows for ingesting and analyzing large, distributed datasets. Dask provides dynamic task scheduling and parallel collections that extend the functionality of NumPy, Pandas, and Scikit-learn, enabling users to scale their code from a single laptop to a cluster of hundreds of machines with ease.
About the Book: Data Science with Python and Dask teaches you to build scalable projects that can handle massive datasets. After meeting the Dask framework, you’ll analyze data in the NYC Parking Ticket database and use DataFrames to streamline your process. Then, you’ll create machine learning models using Dask-ML, build interactive visualizations, and build clusters using AWS and Docker.
What's Inside:
Working with large, structured and unstructured datasets
Visualization with Seaborn and Datashader
Implementing your own algorithms
Building distributed apps with Dask Distributed
Packaging and deploying Dask apps
About the Reader: For data scientists and developers with experience using Python and the PyData stack.
About the Author: Jesse Daniel is an experienced Python developer. He taught Python for Data Science at the University of Denver and leads a team of data scientists at a Denver-based media technology company.
Quotes:
The most comprehensive coverage of Dask to date, with real-world examples that made a difference in my daily work. - Al Krinker, United States Patent and Trademark Office
An excellent alternative to PySpark for those who are not on a cloud platform. The author introduces Dask in a way that speaks directly to an analyst. - Jeremy Loscheider, Panera Bread
A greatly paced introduction to Dask with real-world datasets. - George Thomas, R&D Architecture, Manhattan Associates
The ultimate resource to quickly get up and running with Dask and parallel processing in Python. - Gustavo Patino, Oakland University William Beaumont School of Medicine

Matplotlib for Python Developers - Second Edition

"Matplotlib for Python Developers" is your comprehensive guide to creating interactive and informative data visualizations using the Matplotlib library in Python. This book covers all the essentials, from building static plots to integrating dynamic graphics with web applications.
What this Book will help me do: Design and customize stunning data visualizations including heatmaps and scatter plots. Integrate Matplotlib visualization seamlessly into GUI applications using GTK3 or Qt. Utilize advanced plotting libraries like Seaborn and GeoPandas for enhanced visual representation. Develop web-based dashboards and plots that dynamically update using Django. Master techniques to prepare your Matplotlib projects for deployment in a cloud-based environment.
Author(s): Aldrin Yim, Claire Chung, and Allen Yu are seasoned developers and data scientists with extensive experience in Python and data visualization. They bring a practical touch to technical concepts, aiming to bridge theory with hands-on applications. With such a skilled team behind this book, you'll gain both foundational knowledge and advanced insights into Matplotlib.
Who is it for? This book is the ideal resource for Python developers and data analysts looking to enhance their data visualization skills. If you're familiar with Python and want to create engaging, clear, and dynamic visualizations, this book will give you the tools to achieve that. Designed for a range of expertise, from beginners understanding the basics to experienced users diving into complex integrations, this book has something for everyone. You'll be guided through every step, ensuring you build the confidence and skills needed to thrive in this area.

Pandas for Everyone: Python Data Analysis, First Edition

The Hands-On, Example-Rich Introduction to Pandas Data Analysis in Python
Today, analysts must manage data characterized by extraordinary variety, velocity, and volume. Using the open source Pandas library, you can use Python to rapidly automate and perform virtually any data analysis task, no matter how large or complex. Pandas can help you ensure the veracity of your data, visualize it for effective decision-making, and reliably reproduce analyses across multiple datasets. Pandas for Everyone brings together practical knowledge and insight for solving real problems with Pandas, even if you’re new to Python data analysis. Daniel Y. Chen introduces key concepts through simple but practical examples, incrementally building on them to solve more difficult, real-world problems.
Chen gives you a jumpstart on using Pandas with a realistic dataset and covers combining datasets, handling missing data, and structuring datasets for easier analysis and visualization. He demonstrates powerful data cleaning techniques, from basic string manipulation to applying functions simultaneously across dataframes. Once your data is ready, Chen guides you through fitting models for prediction, clustering, inference, and exploration. He provides tips on performance and scalability, and introduces you to the wider Python data analysis ecosystem.
Work with DataFrames and Series, and import or export data
Create plots with matplotlib, seaborn, and pandas
Combine datasets and handle missing data
Reshape, tidy, and clean datasets so they’re easier to work with
Convert data types and manipulate text strings
Apply functions to scale data manipulations
Aggregate, transform, and filter large datasets with groupby
Leverage Pandas’ advanced date and time capabilities
Fit linear models using statsmodels and scikit-learn libraries
Use generalized linear modeling to fit models with different response variables
Compare multiple models to select the “best”
Regularize to overcome overfitting and improve performance
Use clustering in unsupervised machine learning
Register your product at informit.com/register for convenient access to downloads, updates, and/or corrections as they become available.

Pandas Cookbook

The Pandas Cookbook offers a collection of practical recipes for mastering data manipulation, analysis, and visualization tasks using pandas. Through a methodical, hands-on approach, you will learn to utilize pandas for handling real-world datasets efficiently. By the end of this book, you will be able to solve complex data science problems and create insightful visual representations in Python.
What this Book will help me do: Understand the core functionalities of pandas 0.20 for exploring datasets effectively. Master filtering, selecting, and transforming data for targeted analysis. Leverage pandas' features for aggregating and transforming grouped data. Restructure data for analysis and create professional visualizations using integration with Seaborn and Matplotlib. Gain expertise in handling time series data and SQL-like merging operations.
Author(s): Theodore Petrou, the author of the Pandas Cookbook, is a data scientist and Python expert with extensive experience teaching and using pandas in professional settings. Known for his practical approach, he meticulously explains each recipe and includes comprehensive examples and datasets in Jupyter notebooks to enhance your learning experience.
Who is it for? This book is aimed at data scientists, Python developers, and analysts seeking an in-depth, practical guide to mastering data analysis with pandas. Whether you're a beginner with some knowledge of Python or an experienced analyst looking to refine your skills, this cookbook provides valuable insights and techniques for your data-driven tasks.

Matplotlib 2.x By Example

"Matplotlib 2.x By Example" is your comprehensive guide to mastering data visualization in Python using the Matplotlib library. Through detailed explanations and hands-on examples, this book will teach you how to create stunning, insightful, and professional-looking visual representations of your data. You'll learn valuable skills tailored towards practical applications in science, marketing, and data analysis.
What this Book will help me do: Understand the core features of Matplotlib and how to use them effectively. Create professional 2D and 3D visualizations, such as scatter plots, line graphs, and more. Develop skills to transform raw data into meaningful insights through visualization. Enhance your data visualizations with interactive elements and animations. Leverage additional libraries such as Seaborn and Pandas to expand functionality.
Author(s): Allen Yu, Claire Chung, and Aldrin Yim are seasoned data scientists and technical authors with extensive experience in Python and data visualization. Allen and his coauthors are dedicated to helping readers bridge the gap between their raw data and meaningful insights through visualization. With practical applications and real-world examples, their approachable writing makes complex libraries like Matplotlib accessible and production-ready.
Who is it for? This book is perfect for data enthusiasts, analysts, and Python programmers looking to enhance their data visualization skills. Whether you're a professional aiming to create high-quality visual reports or a student eager to understand and present data effectively, this book provides practical and actionable insights. Basic Python knowledge is expected, while all Matplotlib-related aspects are thoroughly explained.