Search – talk-data.com

Title & Speakers	Event
Data Manipulation with Pandas 2024-08-31 · 17:00 Please register using the Zoom link to get a reminder: https://us02web.zoom.us/webinar/register/7817214087648/WN_wWxCYMFhSJmpWy-hTzC2xw Join us for a hands-on, two-week workshop to master data manipulation with Python's pandas library. This workshop is perfect for students, data enthusiasts, and professionals looking to enhance their data analysis skills. Basic knowledge of Python is recommended. Workshop Outline: Week 1: Introduction and Basic Data Manipulation. Session 1: pandas basics, Series and DataFrame structures, loading/saving data, data selection, and filtering. Session 2: Handling missing data, data transformation, managing duplicates, and combining DataFrames. Week 2: Advanced Techniques and Visualization. Session 3: Grouping and aggregation, pivot tables, cross-tabulation, and working with time series data. Session 4: Data visualization with pandas and other libraries. Agenda (PDT): 10:00 am - 10:05 am Arrival, socializing, and opening; 10:05 am - 11:50 am Dr. Yasin Ceran, "Data Manipulation with Pandas"; 11:50 am - 12:00 pm Q&A. About Dr. Yasin Ceran: Yasin Ceran is passionate about all things data and has extensive experience in data analysis, mathematical modeling, and Apache Spark, as well as in SQL, Python, and R. He is currently an associate professor at KAIST, South Korea, and also teaches at San Jose State University in the heart of Silicon Valley. Yasin has worked rigorously on an array of data-related projects encompassing data mining, statistics, and modeling, and is dedicated to sharing his experience and expertise with learners. Webinar passcode: 953375	Data Manipulation with Pandas
Data Manipulation with Pandas 2024-08-24 · 17:00 Dr. Yasin Ceran – Associate Professor @ KAIST Pandas	Data Manipulation with Pandas
Data Manipulation with Pandas 2024-08-24 · 17:00 Please register using the Zoom link to get a reminder: https://us02web.zoom.us/webinar/register/7817214087648/WN_wWxCYMFhSJmpWy-hTzC2xw Join us for a hands-on, two-week workshop to master data manipulation with Python's pandas library. This workshop is perfect for students, data enthusiasts, and professionals looking to enhance their data analysis skills. Basic knowledge of Python is recommended. Workshop Outline: Week 1: Introduction and Basic Data Manipulation. Session 1: pandas basics, Series and DataFrame structures, loading/saving data, data selection, and filtering. Session 2: Handling missing data, data transformation, managing duplicates, and combining DataFrames. Week 2: Advanced Techniques and Visualization. Session 3: Grouping and aggregation, pivot tables, cross-tabulation, and working with time series data. Session 4: Data visualization with pandas and other libraries. Agenda (PDT): 10:00 am - 10:05 am Arrival, socializing, and opening; 10:05 am - 11:50 am Dr. Yasin Ceran, "Data Manipulation with Pandas"; 11:50 am - 12:00 pm Q&A. About Dr. Yasin Ceran: Yasin Ceran is passionate about all things data and has extensive experience in data analysis, mathematical modeling, and Apache Spark, as well as in SQL, Python, and R. He is currently an associate professor at KAIST, South Korea, and also teaches at San Jose State University in the heart of Silicon Valley. Yasin has worked rigorously on an array of data-related projects encompassing data mining, statistics, and modeling, and is dedicated to sharing his experience and expertise with learners. Webinar passcode: 953375	Data Manipulation with Pandas
Polars Cookbook 2024-08-23 Yuki Kakegawa – author Dive into the world of data analysis with the Polars Cookbook. This book, ideal for data professionals, covers practical recipes to manipulate, transform, and analyze data using the Python Polars library. You'll learn both the fundamentals and advanced techniques to build efficient and scalable data workflows. What this Book will help me do: Master the basics of Python Polars, including installation and setup. Perform complex data manipulation like pivoting, grouping, and joining. Handle large-scale time series data for accurate analysis. Understand data integration with libraries like pandas and NumPy. Optimize workflows for both on-premise and cloud environments. Author(s): Yuki Kakegawa is an experienced data analytics consultant who has collaborated with companies such as Microsoft and Stanford Health Care. His passion for data led him to create this detailed guide on Polars. His expertise ensures you gain real-world, actionable insights from every chapter. Who is it for? This book is perfect for data analysts, engineers, and scientists eager to enhance their efficiency with Python Polars. If you are familiar with Python and tools like pandas but are new to Polars, this book will upskill you. Whether handling big data or optimizing code for performance, the Polars Cookbook has the guidance you need to succeed. data data-science data-science-tools Pandas Analytics Big Data Cloud Computing Data Analytics Microsoft NumPy Polars Python	O'Reilly Data Science Books
Data Manipulation with Pandas 2024-08-17 · 17:00 Dr. Yasin Ceran – Associate Professor @ KAIST Session on data manipulation with pandas. Pandas Python	Data Manipulation with Pandas
Data Manipulation with Pandas 2024-08-17 · 17:00 Please register using the Zoom link to get a reminder: https://us02web.zoom.us/webinar/register/7817214087648/WN_wWxCYMFhSJmpWy-hTzC2xw Join us for a hands-on, two-week workshop to master data manipulation with Python's pandas library. This workshop is perfect for students, data enthusiasts, and professionals looking to enhance their data analysis skills. Basic knowledge of Python is recommended. Workshop Outline: Week 1: Introduction and Basic Data Manipulation. Session 1: pandas basics, Series and DataFrame structures, loading/saving data, data selection, and filtering. Session 2: Handling missing data, data transformation, managing duplicates, and combining DataFrames. Week 2: Advanced Techniques and Visualization. Session 3: Grouping and aggregation, pivot tables, cross-tabulation, and working with time series data. Session 4: Data visualization with pandas and other libraries. Agenda (PDT): 10:00 am - 10:05 am Arrival, socializing, and opening; 10:05 am - 11:50 am Dr. Yasin Ceran, "Data Manipulation with Pandas"; 11:50 am - 12:00 pm Q&A. About Dr. Yasin Ceran: Yasin Ceran is passionate about all things data and has extensive experience in data analysis, mathematical modeling, and Apache Spark, as well as in SQL, Python, and R. He is currently an associate professor at KAIST, South Korea, and also teaches at San Jose State University in the heart of Silicon Valley. Yasin has worked rigorously on an array of data-related projects encompassing data mining, statistics, and modeling, and is dedicated to sharing his experience and expertise with learners. Webinar passcode: 953375	Data Manipulation with Pandas
Data Manipulation with Pandas 2024-08-10 · 17:00 Dr. Yasin Ceran – Associate Professor @ KAIST Pandas basics, Series, and DataFrame structures, loading/saving data, data selection, and filtering. Pandas Python	Data Manipulation with Pandas
Data Manipulation with Pandas 2024-08-10 · 17:00 Please register using the Zoom link to get a reminder: https://us02web.zoom.us/webinar/register/7817214087648/WN_wWxCYMFhSJmpWy-hTzC2xw Join us for a hands-on, two-week workshop to master data manipulation with Python's pandas library. This workshop is perfect for students, data enthusiasts, and professionals looking to enhance their data analysis skills. Basic knowledge of Python is recommended. Workshop Outline: Week 1: Introduction and Basic Data Manipulation. Session 1: pandas basics, Series and DataFrame structures, loading/saving data, data selection, and filtering. Session 2: Handling missing data, data transformation, managing duplicates, and combining DataFrames. Week 2: Advanced Techniques and Visualization. Session 3: Grouping and aggregation, pivot tables, cross-tabulation, and working with time series data. Session 4: Data visualization with pandas and other libraries. Agenda (PDT): 10:00 am - 10:05 am Arrival, socializing, and opening; 10:05 am - 11:50 am Dr. Yasin Ceran, "Data Manipulation with Pandas"; 11:50 am - 12:00 pm Q&A. About Dr. Yasin Ceran: Yasin Ceran is passionate about all things data and has extensive experience in data analysis, mathematical modeling, and Apache Spark, as well as in SQL, Python, and R. He is currently an associate professor at KAIST, South Korea, and also teaches at San Jose State University in the heart of Silicon Valley. Yasin has worked rigorously on an array of data-related projects encompassing data mining, statistics, and modeling, and is dedicated to sharing his experience and expertise with learners. Webinar passcode: 953375	Data Manipulation with Pandas
Data Manipulation with Pandas 2023-12-15 · 18:30 Dr. Yasin Ceran – Associate Professor @ KAIST Hands-on tutorial on data manipulation with pandas. Pandas	Data Manipulation with Pandas
Data Manipulation with Pandas 2023-12-15 · 18:30 Please register using the Zoom link to get a reminder: https://us02web.zoom.us/webinar/register/1316984993505/WN_i227Ph51SHu9znRh6BAKyg This workshop will be a hands-on tutorial for the Python pandas library. pandas is a popular tool for manipulating, cleaning, integrating, and wrangling tabular data. Data scientists spend a significant amount of their time on such operations. This workshop aims to introduce how pandas can be used in data analysis by working on real datasets. The workshop will be held using Jupyter Notebook. One easy way to install it is through the Anaconda platform: https://www.anaconda.com/products/individual Agenda (PST): 10:25 am - 10:30 am Arrival, socializing, and opening; 10:30 am - 12:20 pm Dr. Yasin Ceran, "Data Manipulation with Pandas"; 12:20 pm - 12:30 pm Q&A. About Dr. Yasin Ceran: Yasin Ceran is passionate about all things data and has extensive experience in data analysis, mathematical modeling, and Apache Spark, as well as in SQL, Python, and R. He is currently an associate professor at KAIST, South Korea, and also teaches at San Jose State University in the heart of Silicon Valley. Yasin has worked rigorously on an array of data-related projects encompassing data mining, statistics, and modeling, and is dedicated to sharing his experience and expertise with learners. Webinar passcode: 356741	Data Manipulation with Pandas
Data Manipulation with Pandas 2023-11-24 · 18:30 Please register using the Zoom link to get a reminder: https://us02web.zoom.us/webinar/register/1316984993505/WN_rmJ7nMIzQWK76evLIozZOg This workshop will be a hands-on tutorial for the Python pandas library. pandas is a popular tool for manipulating, cleaning, integrating, and wrangling tabular data. Data scientists spend a significant amount of their time on such operations. This workshop aims to introduce how pandas can be used in data analysis by working on real datasets. The workshop will be held using Jupyter Notebook. One easy way to install it is through the Anaconda platform: https://www.anaconda.com/products/individual Agenda (PST): 10:25 am - 10:30 am Arrival, socializing, and opening; 10:30 am - 12:20 pm Dr. Yasin Ceran, "Data Manipulation with Pandas"; 12:20 pm - 12:30 pm Q&A. About Dr. Yasin Ceran: Yasin Ceran is passionate about all things data and has extensive experience in data analysis, mathematical modeling, and Apache Spark, as well as in SQL, Python, and R. He is currently an associate professor at KAIST, South Korea, and also teaches at San Jose State University in the heart of Silicon Valley. Yasin has worked rigorously on an array of data-related projects encompassing data mining, statistics, and modeling, and is dedicated to sharing his experience and expertise with learners. Webinar passcode: 356741	Data Manipulation with Pandas
Python Data Analytics: With Pandas, NumPy, and Matplotlib 2023-09-01 Fabio Nelli – author Explore the latest Python tools and techniques to help you tackle the world of data acquisition and analysis. You'll review scientific computing with NumPy, visualization with matplotlib, and machine learning with scikit-learn. This third edition is fully updated for the latest version of Python and its related libraries, and includes coverage of social media data analysis, image analysis with OpenCV, and deep learning libraries. Each chapter includes multiple examples demonstrating how to work with each library. At its heart lies the coverage of pandas, for high-performance, easy-to-use data structures and tools for data manipulation. Author Fabio Nelli expertly demonstrates using Python for data processing, management, and information retrieval. Later chapters apply what you've learned to handwriting recognition and extending graphical capabilities with the JavaScript D3 library. Whether you are dealing with sales data, investment data, medical data, web page usage, or other data sets, Python Data Analytics, Third Edition is an invaluable reference with its examples of storing, accessing, and analyzing data. What You'll Learn: Understand the core concepts of data analysis and the Python ecosystem. Go in depth with pandas for reading, writing, and processing data. Use tools and techniques for data visualization and image analysis. Examine popular deep learning libraries Keras, Theano, TensorFlow, and PyTorch. Who This Book Is For: Experienced Python developers who need to learn about Pythonic tools for data analysis. data data-science data-science-tools Pandas AI/ML Analytics Data Analytics DataViz JavaScript Keras Matplotlib NumPy Python PyTorch Scikit-learn TensorFlow	O'Reilly Data Science Books
Mastering Data Manipulation with Pandas 2023-08-15 · 18:50 Dr. Sefer Baday – Assistant Professor @ Informatics Institute of Istanbul Technical University A hands-on tutorial for the Python pandas library covering data manipulation, cleaning, integration, and wrangling of tabular data. Pandas Python Jupyter Notebook	Mastering Data Manipulation with Pandas

Introduction to Data Analysis Using Pandas 2025-07-07 · 20:30

Stefanie Molin – author

Working with data can be challenging: it often doesn’t come in the best format for analysis, and understanding it well enough to extract insights requires both time and the skills to filter, aggregate, reshape, and visualize it. This session will equip you with the knowledge you need to effectively use pandas – a powerful library for data analysis in Python – to make this process easier.

Pandas makes it possible to work with tabular data and perform all parts of the analysis from collection and manipulation through aggregation and visualization. While most of this session focuses on pandas, during our discussion of visualization, we will also introduce at a high level Matplotlib (the library that pandas uses for its visualization features, which when used directly makes it possible to create custom layouts, add annotations, etc.) and Seaborn (another plotting library, which features additional plot types and the ability to visualize long-format data).

Matplotlib Pandas Python Seaborn

SciPy 2025

Industry Roundup #2: AI Agents for Data Work, The Return of the Full-Stack Data Scientist and Old languages Make a Comeback 2024-12-06 · 11:00

Adel – host @ DataFramed , Richie – host @ DataCamp

Welcome to DataFramed Industry Roundups! In this series of episodes, Adel & Richie sit down to discuss the latest and greatest in data & AI. In this episode, we touch upon AI agents for data work, whether the full-stack data scientist will make a return, old languages making a comeback, Python's increase in performance, what they're both thankful for, and much more.

Links Mentioned in the Show
Fractal’s Data Science Agent: Arya
Article: What Makes a True AI Agent? Rethinking the Pursuit of Autonomy
Cassie Kozyrkov on DataFramed
TIOBE Index for November 2024
Community discussion on Fortran
Tutorial: High Performance Data Manipulation in Python: pandas 2.0 vs. polars

New to DataCamp?
Learn on the go using the DataCamp mobile app
Empower your business with world-class data and AI skills with DataCamp for business

AI/ML Data Science Pandas Polars Python

DataFramed

Pandas Cookbook - Third Edition 2024-10-31

William Ayd – author , Matthew Harrison – author

Discover the power of pandas for your data analysis tasks. Pandas Cookbook provides practical, hands-on recipes for mastering pandas 2.x, guiding you through real-world scenarios quickly and effectively.

What this Book will help me do
Efficiently manipulate and clean data using pandas.
Perform advanced grouping and aggregation operations.
Handle time series data with pandas' robust functions.
Optimize pandas code for better performance.
Integrate pandas with tools like NumPy and databases.

Author(s)
William Ayd and Matthew Harrison co-authored this insightful cookbook. With years of practical experience in data science and Python development, both authors aim to make data analysis accessible and efficient using pandas.

Who is it for?
This book is perfect for Python developers and data analysts looking to enhance their data manipulation skills. Whether you're a beginner aiming to understand pandas or a professional seeking advanced insights, this book is tailored for anyone handling structured data.

data data-science data-science-tools Pandas Data Science NumPy Python

O'Reilly Data Science Books

[Online] Polars for Data Analysis in Python 2024-10-08 · 16:00

Discover Polars, the high-performance DataFrame library revolutionizing data analysis in Python. Built on Rust, Polars offers unparalleled speed and efficiency, outperforming pandas, Dask, and even PySpark. Explore its innovative features like lazy evaluation, memory efficiency, and automatic multi-threading, designed to handle large datasets with ease.

In this session, you'll learn practical techniques for data manipulation and advanced transformations. We will demonstrate Polars' syntax and capabilities, making it accessible even if you’re new to Polars. Join us to elevate your Python data analysis to the next level.

This presentation covers:

Section 1: What is Polars and how does it compare to pandas?
Section 2: Getting Started with Polars in Python
Section 3: Advanced Data Analysis with Polars
Section 4: Should you switch to Polars?

----------------------------------------
How to Join the Webinar
----------------------------------------
You can join via your browser (no app download required). Use Chrome or Firefox.
Pre-register for the webinar: https://www.bigmarker.com/neo4j/Data-Umbrella-Webinar

--------------------------------
Video Recording
--------------------------------
This event will be recorded and placed on our YouTube channel. We usually have it up within 24 hours of the event.
Subscribe to our YouTube channel and set your notifications: https://www.youtube.com/c/DataUmbrella/

----------------------------------------
Time
----------------------------------------
16:00 UTC / 9am PT / 12pm ET / 7pm EAT / 9:30pm IST

----------------------------------------
Additional Details
----------------------------------------
Talk Level: Intermediate
Pre-reqs: Intermediate knowledge of Python and pandas
Prep Work: None
Resources: Polars documentation: https://docs.pola.rs/

----------------------------------------
Connect with Data Umbrella
----------------------------------------
We invite you to follow Data Umbrella on our social networking sites to keep up to date on the latest news.

[Online] Polars for Data Analysis in Python

Data Engineering for Machine Learning Pipelines: From Python Libraries to ML Pipelines and Cloud Platforms 2024-09-27

Pavan Kumar Narayanan – author

This book covers modern data engineering functions and important Python libraries, to help you develop state-of-the-art ML pipelines and integration code. The book begins by explaining data analytics and transformation, delving into the Pandas library, its capabilities, and nuances. It then explores emerging libraries such as Polars and CuDF, providing insights into GPU-based computing and cutting-edge data manipulation techniques. The text discusses the importance of data validation in engineering processes, introducing tools such as Great Expectations and Pandera to ensure data quality and reliability. The book delves into API design and development, with a specific focus on leveraging the power of FastAPI. It covers authentication, authorization, and real-world applications, enabling you to construct efficient and secure APIs using FastAPI. Also explored is concurrency in data engineering, examining Dask's capabilities from basic setup to crafting advanced machine learning pipelines. The book includes development and delivery of data engineering pipelines using leading cloud platforms such as AWS, Google Cloud, and Microsoft Azure. The concluding chapters concentrate on real-time and streaming data engineering pipelines, emphasizing Apache Kafka and workflow orchestration in data engineering. Workflow tools such as Airflow and Prefect are introduced to seamlessly manage and automate complex data workflows. What sets this book apart is its blend of theoretical knowledge and practical application, a structured path from basic to advanced concepts, and insights into using state-of-the-art tools. With this book, you gain access to cutting-edge techniques and insights that are reshaping the industry. This book is not just an educational tool. It is a career catalyst, and an investment in your future as a data engineering expert, poised to meet the challenges of today's data-driven world. 
What You Will Learn
Elevate your data wrangling jobs by utilizing the power of both CPU and GPU computing, and learn to process data using Pandas 2.0, Polars, and CuDF at unprecedented speeds
Design data validation pipelines, construct efficient data service APIs, develop real-time streaming pipelines, and master the art of workflow orchestration to streamline your engineering projects
Leverage concurrent programming to develop machine learning pipelines and get hands-on experience in development and deployment of machine learning pipelines across AWS, GCP, and Azure

Who This Book Is For
Data analysts, data engineers, data scientists, machine learning engineers, and MLOps specialists

data ai-ml machine-learning AI/ML Airflow Analytics API AWS Azure Cloud Computing Data Analytics Data Engineering Data Quality GCP Kafka Microsoft MLOps Pandas Polars Prefect Python Data Streaming

O'Reilly Data Engineering Books

Data Manipulation with Pandas 2024-08-31 · 17:00

Please register using the zoom link to get a reminder:

https://us02web.zoom.us/webinar/register/7817214087648/WN_wWxCYMFhSJmpWy-hTzC2xw

Join us for a hands-on, two-week workshop to master data manipulation with Python's pandas library. This workshop is perfect for students, data enthusiasts, and professionals looking to enhance their data analysis skills. Basic knowledge of Python is recommended.

Workshop Outline:

Week 1: Introduction and Basic Data Manipulation
Session 1: Pandas basics, Series and DataFrame structures, loading/saving data, data selection, and filtering.
Session 2: Handling missing data, data transformation, managing duplicates, and combining DataFrames.
Week 2: Advanced Techniques and Visualization
Session 3: Grouping and aggregation, pivot tables, cross-tabulation, and working with time series data.
Session 4: Data visualization with pandas and other libraries.
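
To give a flavor of the topics listed in the outline, here is a minimal sketch that touches duplicates, missing data, grouping, and pivot tables. The dataset is hypothetical and is not part of the workshop materials.

```python
import pandas as pd

# Hypothetical sample data, not from the workshop.
df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Lima", "Lima", "Lima"],
    "year": [2023, 2024, 2023, 2024, 2024],
    "sales": [10.0, None, 7.0, 5.0, 5.0],
})

# Session 2 topics: drop exact duplicate rows, then fill missing values.
clean = df.drop_duplicates().fillna({"sales": 0.0})

# Session 3 topics: group and aggregate, then reshape with a pivot table.
totals = clean.groupby("city")["sales"].sum()
pivot = clean.pivot_table(index="city", columns="year", values="sales", aggfunc="sum")

print(totals)
print(pivot)
```

Filling the missing value with 0.0 is only one possible choice; depending on the data, dropping the row or interpolating may be more appropriate.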

Agenda:

(PDT) 10:00 am - 10:05 am Arrival, socializing, and opening
(PDT) 10:05 am - 11:50 am Dr. Yasin Ceran, "Data Manipulation with Pandas"
(PDT) 11:50 am - 12:00 pm Q&A

About Dr. Yasin Ceran:

Yasin Ceran is passionate about all things data and has extensive experience in data analysis, mathematical modeling, Apache Spark, SQL, Python, and R. He is currently an associate professor at KAIST, South Korea, and also teaches at San Jose State University, in the heart of Silicon Valley. Yasin has worked rigorously on an array of data-related projects encompassing data mining, statistics, and modeling, and is dedicated to sharing his experience and expertise with learners.

https://us02web.zoom.us/webinar/register/7817214087648/WN_wWxCYMFhSJmpWy-hTzC2xw

Webinar Passcode 953375

Data Manipulation with Pandas

Data Manipulation with Pandas 2024-08-24 · 17:00

Dr. Yasin Ceran – Associate Professor @ KAIST

Pandas

Data Manipulation with Pandas

Data Manipulation with Pandas 2024-08-24 · 17:00

Please register using the zoom link to get a reminder:

https://us02web.zoom.us/webinar/register/7817214087648/WN_wWxCYMFhSJmpWy-hTzC2xw

Join us for a hands-on, two-week workshop to master data manipulation with Python's pandas library. This workshop is perfect for students, data enthusiasts, and professionals looking to enhance their data analysis skills. Basic knowledge of Python is recommended.

Workshop Outline:

Week 1: Introduction and Basic Data Manipulation
Session 1: Pandas basics, Series and DataFrame structures, loading/saving data, data selection, and filtering.
Session 2: Handling missing data, data transformation, managing duplicates, and combining DataFrames.
Week 2: Advanced Techniques and Visualization
Session 3: Grouping and aggregation, pivot tables, cross-tabulation, and working with time series data.
Session 4: Data visualization with pandas and other libraries.

Agenda:

(PDT) 10:00 am - 10:05 am Arrival, socializing, and opening
(PDT) 10:05 am - 11:50 am Dr. Yasin Ceran, "Data Manipulation with Pandas"
(PDT) 11:50 am - 12:00 pm Q&A

About Dr. Yasin Ceran:

Yasin Ceran is passionate about all things data and has extensive experience in data analysis, mathematical modeling, Apache Spark, SQL, Python, and R. He is currently an associate professor at KAIST, South Korea, and also teaches at San Jose State University, in the heart of Silicon Valley. Yasin has worked rigorously on an array of data-related projects encompassing data mining, statistics, and modeling, and is dedicated to sharing his experience and expertise with learners.

https://us02web.zoom.us/webinar/register/7817214087648/WN_wWxCYMFhSJmpWy-hTzC2xw

Webinar Passcode 953375

Data Manipulation with Pandas

Polars Cookbook 2024-08-23

Yuki Kakegawa – author

Dive into the world of data analysis with the Polars Cookbook. This book, ideal for data professionals, covers practical recipes to manipulate, transform, and analyze data using the Python Polars library. You'll learn both the fundamentals and advanced techniques to build efficient and scalable data workflows.

What this Book will help me do
Master the basics of Python Polars including installation and setup.
Perform complex data manipulation like pivoting, grouping, and joining.
Handle large-scale time series data for accurate analysis.
Understand data integration with libraries like pandas and NumPy.
Optimize workflows for both on-premise and cloud environments.

Author(s)
Yuki Kakegawa is an experienced data analytics consultant who has collaborated with companies such as Microsoft and Stanford Health Care. His passion for data led him to create this detailed guide on Polars. His expertise ensures you gain real-world, actionable insights from every chapter.

Who is it for?
This book is perfect for data analysts, engineers, and scientists eager to enhance their efficiency with Python Polars. If you are familiar with Python and tools like pandas but are new to Polars, this book will upskill you. Whether handling big data or optimizing code for performance, the Polars Cookbook has the guidance you need to succeed.

data data-science data-science-tools Pandas Analytics Big Data Cloud Computing Data Analytics Microsoft NumPy Polars Python

O'Reilly Data Science Books

Data Manipulation with Pandas 2024-08-17 · 17:00

Dr. Yasin Ceran – Associate Professor @ KAIST

Session on data manipulation with pandas.

Pandas Python

Data Manipulation with Pandas

Data Manipulation with Pandas 2024-08-17 · 17:00

Please register using the zoom link to get a reminder:

https://us02web.zoom.us/webinar/register/7817214087648/WN_wWxCYMFhSJmpWy-hTzC2xw

Join us for a hands-on, two-week workshop to master data manipulation with Python's pandas library. This workshop is perfect for students, data enthusiasts, and professionals looking to enhance their data analysis skills. Basic knowledge of Python is recommended.

Workshop Outline:

Week 1: Introduction and Basic Data Manipulation
Session 1: Pandas basics, Series and DataFrame structures, loading/saving data, data selection, and filtering.
Session 2: Handling missing data, data transformation, managing duplicates, and combining DataFrames.
Week 2: Advanced Techniques and Visualization
Session 3: Grouping and aggregation, pivot tables, cross-tabulation, and working with time series data.
Session 4: Data visualization with pandas and other libraries.

Agenda:

(PDT) 10:00 am - 10:05 am Arrival, socializing, and opening
(PDT) 10:05 am - 11:50 am Dr. Yasin Ceran, "Data Manipulation with Pandas"
(PDT) 11:50 am - 12:00 pm Q&A

About Dr. Yasin Ceran:

Yasin Ceran is passionate about all things data and has extensive experience in data analysis, mathematical modeling, Apache Spark, SQL, Python, and R. He is currently an associate professor at KAIST, South Korea, and also teaches at San Jose State University, in the heart of Silicon Valley. Yasin has worked rigorously on an array of data-related projects encompassing data mining, statistics, and modeling, and is dedicated to sharing his experience and expertise with learners.

https://us02web.zoom.us/webinar/register/7817214087648/WN_wWxCYMFhSJmpWy-hTzC2xw

Webinar Passcode 953375

Data Manipulation with Pandas

Data Manipulation with Pandas 2024-08-10 · 17:00

Dr. Yasin Ceran – Associate Professor @ KAIST

Pandas basics, Series, and DataFrame structures, loading/saving data, data selection, and filtering.

Pandas Python

Data Manipulation with Pandas

Data Manipulation with Pandas 2024-08-10 · 17:00

Please register using the zoom link to get a reminder:

https://us02web.zoom.us/webinar/register/7817214087648/WN_wWxCYMFhSJmpWy-hTzC2xw

Join us for a hands-on, two-week workshop to master data manipulation with Python's pandas library. This workshop is perfect for students, data enthusiasts, and professionals looking to enhance their data analysis skills. Basic knowledge of Python is recommended.

Workshop Outline:

Week 1: Introduction and Basic Data Manipulation
Session 1: Pandas basics, Series and DataFrame structures, loading/saving data, data selection, and filtering.
Session 2: Handling missing data, data transformation, managing duplicates, and combining DataFrames.
Week 2: Advanced Techniques and Visualization
Session 3: Grouping and aggregation, pivot tables, cross-tabulation, and working with time series data.
Session 4: Data visualization with pandas and other libraries.

Agenda:

(PDT) 10:00 am - 10:05 am Arrival, socializing, and opening
(PDT) 10:05 am - 11:50 am Dr. Yasin Ceran, "Data Manipulation with Pandas"
(PDT) 11:50 am - 12:00 pm Q&A

About Dr. Yasin Ceran:

Yasin Ceran is passionate about all things data and has extensive experience in data analysis, mathematical modeling, Apache Spark, SQL, Python, and R. He is currently an associate professor at KAIST, South Korea, and also teaches at San Jose State University, in the heart of Silicon Valley. Yasin has worked rigorously on an array of data-related projects encompassing data mining, statistics, and modeling, and is dedicated to sharing his experience and expertise with learners.

https://us02web.zoom.us/webinar/register/7817214087648/WN_wWxCYMFhSJmpWy-hTzC2xw

Webinar Passcode 953375

Data Manipulation with Pandas

Data Manipulation with Pandas 2023-12-15 · 18:30

Dr. Yasin Ceran – Associate Professor @ KAIST

Hands-on tutorial on data manipulation with pandas.

Pandas

Data Manipulation with Pandas

Data Manipulation with Pandas 2023-12-15 · 18:30

Please register using the zoom link to get a reminder:

https://us02web.zoom.us/webinar/register/1316984993505/WN_i227Ph51SHu9znRh6BAKyg

This workshop will be a hands-on tutorial for the Python pandas library. pandas is one of the most popular tools for manipulating, cleaning, integrating, and wrangling tabular data. Data scientists spend a significant amount of their time on such operations. This workshop aims to introduce how pandas can be used in data analysis by working on real datasets.

The workshop will be held using Jupyter Notebook. One easy way to install it is through the Anaconda platform.

https://www.anaconda.com/products/individual

Agenda:

(PST) 10:25 am - 10:30 am Arrival, socializing, and opening
(PST) 10:30 am - 12:20 pm Dr. Yasin Ceran, "Data Manipulation with Pandas"
(PST) 12:20 pm - 12:30 pm Q&A

About Dr. Yasin Ceran:

Yasin Ceran is passionate about all things data and has extensive experience in data analysis, mathematical modeling, Apache Spark, SQL, Python, and R. He is currently an associate professor at KAIST, South Korea, and also teaches at San Jose State University, in the heart of Silicon Valley. Yasin has worked rigorously on an array of data-related projects encompassing data mining, statistics, and modeling, and is dedicated to sharing his experience and expertise with learners.

https://us02web.zoom.us/webinar/register/1316984993505/WN_i227Ph51SHu9znRh6BAKyg

Webinar Passcode 356741

Data Manipulation with Pandas

Data Manipulation with Pandas 2023-11-24 · 18:30

Please register using the zoom link to get a reminder:

https://us02web.zoom.us/webinar/register/1316984993505/WN_rmJ7nMIzQWK76evLIozZOg

This workshop will be a hands-on tutorial for the Python pandas library. pandas is one of the most popular tools for manipulating, cleaning, integrating, and wrangling tabular data. Data scientists spend a significant amount of their time on such operations. This workshop aims to introduce how pandas can be used in data analysis by working on real datasets.

The workshop will be held using Jupyter Notebook. One easy way to install it is through the Anaconda platform.

https://www.anaconda.com/products/individual

Agenda:

(PST) 10:25 am - 10:30 am Arrival, socializing, and opening
(PST) 10:30 am - 12:20 pm Dr. Yasin Ceran, "Data Manipulation with Pandas"
(PST) 12:20 pm - 12:30 pm Q&A

About Dr. Yasin Ceran:

Yasin Ceran is passionate about all things data and has extensive experience in data analysis, mathematical modeling, Apache Spark, SQL, Python, and R. He is currently an associate professor at KAIST, South Korea, and also teaches at San Jose State University, in the heart of Silicon Valley. Yasin has worked rigorously on an array of data-related projects encompassing data mining, statistics, and modeling, and is dedicated to sharing his experience and expertise with learners.

https://us02web.zoom.us/webinar/register/1316984993505/WN_rmJ7nMIzQWK76evLIozZOg

Webinar Passcode 356741

Data Manipulation with Pandas

Python Data Analytics: With Pandas, NumPy, and Matplotlib 2023-09-01

Fabio Nelli – author

Explore the latest Python tools and techniques to help you tackle the world of data acquisition and analysis. You'll review scientific computing with NumPy, visualization with matplotlib, and machine learning with scikit-learn. This third edition is fully updated for the latest version of Python and its related libraries, and includes coverage of social media data analysis, image analysis with OpenCV, and deep learning libraries. Each chapter includes multiple examples demonstrating how to work with each library. At its heart lies the coverage of pandas, for high-performance, easy-to-use data structures and tools for data manipulation. Author Fabio Nelli expertly demonstrates using Python for data processing, management, and information retrieval. Later chapters apply what you've learned to handwriting recognition and extending graphical capabilities with the JavaScript D3 library. Whether you are dealing with sales data, investment data, medical data, web page usage, or other data sets, Python Data Analytics, Third Edition is an invaluable reference with its examples of storing, accessing, and analyzing data.

What You'll Learn
Understand the core concepts of data analysis and the Python ecosystem
Go in depth with pandas for reading, writing, and processing data
Use tools and techniques for data visualization and image analysis
Examine popular deep learning libraries Keras, Theano, TensorFlow, and PyTorch

Who This Book Is For
Experienced Python developers who need to learn about Pythonic tools for data analysis

data data-science data-science-tools Pandas AI/ML Analytics Data Analytics DataViz JavaScript Keras Matplotlib NumPy Python PyTorch Scikit-learn TensorFlow

O'Reilly Data Science Books

Mastering Data Manipulation with Pandas 2023-08-15 · 18:50

Dr. Sefer Baday – Assistant Professor @ Informatics Institute of Istanbul Technical University

A hands-on tutorial for the Python pandas library covering data manipulation, cleaning, integration, and wrangling of tabular data.

Pandas Python Jupyter Notebook

Mastering Data Manipulation with Pandas

Activities & events