talk-data.com
Activities & events
| Title & Speakers | Event |
|---|---|
|
Introduction to Data Analysis Using Pandas
2025-07-07 · 20:30
Stefanie Molin
– author
Working with data can be challenging: it often doesn’t come in the best format for analysis, and understanding it well enough to extract insights requires both time and the skills to filter, aggregate, reshape, and visualize it. This session will equip you with the knowledge you need to effectively use pandas – a powerful library for data analysis in Python – to make this process easier. Pandas makes it possible to work with tabular data and perform all parts of the analysis from collection and manipulation through aggregation and visualization. While most of this session focuses on pandas, during our discussion of visualization, we will also introduce at a high level Matplotlib (the library that pandas uses for its visualization features, which when used directly makes it possible to create custom layouts, add annotations, etc.) and Seaborn (another plotting library, which features additional plot types and the ability to visualize long-format data). |
SciPy 2025
|
|
Industry Roundup #2: AI Agents for Data Work, The Return of the Full-Stack Data Scientist and Old Languages Make a Comeback
2024-12-06 · 11:00
Welcome to DataFramed Industry Roundups! In this series of episodes, Adel & Richie sit down to discuss the latest and greatest in data & AI. In this episode, we touch upon AI agents for data work, whether the full-stack data scientist will make a return, old languages making a comeback, Python's increase in performance, what they're both thankful for, and much more. Links mentioned in the show: Fractal’s Data Science Agent: Arya; Article: What Makes a True AI Agent? Rethinking the Pursuit of Autonomy; Cassie Kozyrkov on DataFramed; TIOBE Index for November 2024; Community discussion on Fortran; Tutorial: High Performance Data Manipulation in Python: pandas 2.0 vs. polars. New to DataCamp? Learn on the go using the DataCamp mobile app. Empower your business with world-class data and AI skills with DataCamp for business. |
DataFramed |
|
Pandas Cookbook - Third Edition
2024-10-31
William Ayd
– author
,
Matthew Harrison
– author
Discover the power of pandas for your data analysis tasks. Pandas Cookbook provides practical, hands-on recipes for mastering pandas 2.x, guiding you through real-world scenarios quickly and effectively. What this book will help me do: Efficiently manipulate and clean data using pandas. Perform advanced grouping and aggregation operations. Handle time series data with pandas' robust functions. Optimize pandas code for better performance. Integrate pandas with tools like NumPy and databases. Author(s): William Ayd and Matthew Harrison co-authored this cookbook. With years of practical experience in data science and Python development, both authors aim to make data analysis accessible and efficient using pandas. Who is it for? This book is perfect for Python developers and data analysts looking to enhance their data manipulation skills. Whether you're a beginner aiming to understand pandas or a professional seeking advanced insights, this book is tailored for anyone handling structured data. |
O'Reilly Data Science Books
|
|
[Online] Polars for Data Analysis in Python
2024-10-08 · 16:00
Discover Polars, the high-performance DataFrame library revolutionizing data analysis in Python. Built on Rust, Polars offers unparalleled speed and efficiency, outperforming pandas, Dask, and even PySpark. Explore its innovative features like lazy evaluation, memory efficiency, and automatic multi-threading, designed to handle large datasets with ease. In this session, you'll learn practical techniques for data manipulation and advanced transformations. We will demonstrate Polars' syntax and capabilities, making it accessible even if you’re new to Polars. Join us to elevate your Python data analysis to the next level. This presentation covers:
How to join the webinar: You can join via your browser (no app download required). Use Chrome or Firefox. Pre-register for the webinar: https://www.bigmarker.com/neo4j/Data-Umbrella-Webinar Video recording: This event will be recorded and placed on our YouTube channel. We usually have it up within 24 hours of the event. Subscribe to our YouTube channel and set your notifications: https://www.youtube.com/c/DataUmbrella/ Time: 16:00 UTC, 9am PT / 12pm ET / 7pm EAT / 9:30pm IST. Additional details: Talk level: Intermediate. Pre-reqs: Intermediate knowledge of Python and pandas. Prep work: None. Resources: Polars documentation: https://docs.pola.rs/ Connect with Data Umbrella: We invite you to follow Data Umbrella on our social networking sites to keep up to date on the latest news. |
[Online] Polars for Data Analysis in Python
|
|
Pavan Kumar Narayanan
– author
This book covers modern data engineering functions and important Python libraries, to help you develop state-of-the-art ML pipelines and integration code. The book begins by explaining data analytics and transformation, delving into the Pandas library, its capabilities, and nuances. It then explores emerging libraries such as Polars and CuDF, providing insights into GPU-based computing and cutting-edge data manipulation techniques. The text discusses the importance of data validation in engineering processes, introducing tools such as Great Expectations and Pandera to ensure data quality and reliability. The book delves into API design and development, with a specific focus on leveraging the power of FastAPI. It covers authentication, authorization, and real-world applications, enabling you to construct efficient and secure APIs using FastAPI. Also explored is concurrency in data engineering, examining Dask's capabilities from basic setup to crafting advanced machine learning pipelines. The book includes development and delivery of data engineering pipelines using leading cloud platforms such as AWS, Google Cloud, and Microsoft Azure. The concluding chapters concentrate on real-time and streaming data engineering pipelines, emphasizing Apache Kafka and workflow orchestration in data engineering. Workflow tools such as Airflow and Prefect are introduced to seamlessly manage and automate complex data workflows. What sets this book apart is its blend of theoretical knowledge and practical application, a structured path from basic to advanced concepts, and insights into using state-of-the-art tools. With this book, you gain access to cutting-edge techniques and insights that are reshaping the industry. This book is not just an educational tool. It is a career catalyst, and an investment in your future as a data engineering expert, poised to meet the challenges of today's data-driven world. 
What You Will Learn: Elevate your data wrangling jobs by utilizing the power of both CPU and GPU computing, and learn to process data using Pandas 2.0, Polars, and CuDF at unprecedented speeds. Design data validation pipelines, construct efficient data service APIs, develop real-time streaming pipelines, and master the art of workflow orchestration to streamline your engineering projects. Leverage concurrent programming to develop machine learning pipelines, and get hands-on experience in development and deployment of machine learning pipelines across AWS, GCP, and Azure. Who This Book Is For: Data analysts, data engineers, data scientists, machine learning engineers, and MLOps specialists |
O'Reilly Data Engineering Books
|
|
Data Manipulation with Pandas
2024-08-31 · 17:00
Please register using the Zoom link to get a reminder: https://us02web.zoom.us/webinar/register/7817214087648/WN_wWxCYMFhSJmpWy-hTzC2xw Join us for a hands-on, two-week workshop to master data manipulation with Python's pandas library. This workshop is perfect for students, data enthusiasts, and professionals looking to enhance their data analysis skills. Basic knowledge of Python is recommended. Workshop Outline: Week 1: Introduction and Basic Data Manipulation. Session 1: pandas basics, Series and DataFrame structures, loading/saving data, data selection, and filtering. Session 2: Handling missing data, data transformation, managing duplicates, and combining DataFrames. Week 2: Advanced Techniques and Visualization. Session 3: Grouping and aggregation, pivot tables, cross-tabulation, and working with time series data. Session 4: Data visualization with pandas and other libraries. Agenda (PDT): 10:00 am - 10:05 am: Arrival, socializing, and opening. 10:05 am - 11:50 am: Dr. Yasin Ceran, "Data Manipulation with Pandas". 11:50 am - 12:00 pm: Q&A. About Dr. Yasin Ceran: Yasin Ceran is passionate about all things data and has extensive experience in data analysis, mathematical modeling, Apache Spark, SQL, Python, and R. He is currently an associate professor at KAIST, South Korea, and also teaches at San Jose State University in the heart of Silicon Valley. Yasin has worked on an array of data-related projects encompassing data mining, statistics, and modeling, and is dedicated to sharing his experience and expertise with learners. Webinar passcode: 953375 |
Data Manipulation with Pandas
|
|
Data Manipulation with Pandas
2024-08-24 · 17:00
Dr. Yasin Ceran
– Associate Professor
@ KAIST
Pandas
|
Data Manipulation with Pandas
|
|
Data Manipulation with Pandas
2024-08-24 · 17:00
Please register using the Zoom link to get a reminder: https://us02web.zoom.us/webinar/register/7817214087648/WN_wWxCYMFhSJmpWy-hTzC2xw Join us for a hands-on, two-week workshop to master data manipulation with Python's pandas library. This workshop is perfect for students, data enthusiasts, and professionals looking to enhance their data analysis skills. Basic knowledge of Python is recommended. Workshop Outline: Week 1: Introduction and Basic Data Manipulation. Session 1: pandas basics, Series and DataFrame structures, loading/saving data, data selection, and filtering. Session 2: Handling missing data, data transformation, managing duplicates, and combining DataFrames. Week 2: Advanced Techniques and Visualization. Session 3: Grouping and aggregation, pivot tables, cross-tabulation, and working with time series data. Session 4: Data visualization with pandas and other libraries. Agenda (PDT): 10:00 am - 10:05 am: Arrival, socializing, and opening. 10:05 am - 11:50 am: Dr. Yasin Ceran, "Data Manipulation with Pandas". 11:50 am - 12:00 pm: Q&A. About Dr. Yasin Ceran: Yasin Ceran is passionate about all things data and has extensive experience in data analysis, mathematical modeling, Apache Spark, SQL, Python, and R. He is currently an associate professor at KAIST, South Korea, and also teaches at San Jose State University in the heart of Silicon Valley. Yasin has worked on an array of data-related projects encompassing data mining, statistics, and modeling, and is dedicated to sharing his experience and expertise with learners. Webinar passcode: 953375 |
Data Manipulation with Pandas
|
|
Polars Cookbook
2024-08-23
Yuki Kakegawa
– author
Dive into the world of data analysis with the Polars Cookbook. This book, ideal for data professionals, covers practical recipes to manipulate, transform, and analyze data using the Python Polars library. You'll learn both the fundamentals and advanced techniques to build efficient and scalable data workflows. What this book will help me do: Master the basics of Python Polars, including installation and setup. Perform complex data manipulation like pivoting, grouping, and joining. Handle large-scale time series data for accurate analysis. Understand data integration with libraries like pandas and NumPy. Optimize workflows for both on-premise and cloud environments. Author(s): Yuki Kakegawa is an experienced data analytics consultant who has collaborated with companies such as Microsoft and Stanford Health Care. His passion for data led him to create this detailed guide on Polars. His expertise ensures you gain real-world, actionable insights from every chapter. Who is it for? This book is perfect for data analysts, engineers, and scientists eager to enhance their efficiency with Python Polars. If you are familiar with Python and tools like pandas but are new to Polars, this book will upskill you. Whether handling big data or optimizing code for performance, the Polars Cookbook has the guidance you need to succeed. |
O'Reilly Data Science Books
|
|
Data Manipulation with Pandas
2024-08-17 · 17:00
Dr. Yasin Ceran
– Associate Professor
@ KAIST
Session on data manipulation with pandas. |
Data Manipulation with Pandas
|
|
Data Manipulation with Pandas
2024-08-17 · 17:00
Please register using the Zoom link to get a reminder: https://us02web.zoom.us/webinar/register/7817214087648/WN_wWxCYMFhSJmpWy-hTzC2xw Join us for a hands-on, two-week workshop to master data manipulation with Python's pandas library. This workshop is perfect for students, data enthusiasts, and professionals looking to enhance their data analysis skills. Basic knowledge of Python is recommended. Workshop Outline: Week 1: Introduction and Basic Data Manipulation. Session 1: pandas basics, Series and DataFrame structures, loading/saving data, data selection, and filtering. Session 2: Handling missing data, data transformation, managing duplicates, and combining DataFrames. Week 2: Advanced Techniques and Visualization. Session 3: Grouping and aggregation, pivot tables, cross-tabulation, and working with time series data. Session 4: Data visualization with pandas and other libraries. Agenda (PDT): 10:00 am - 10:05 am: Arrival, socializing, and opening. 10:05 am - 11:50 am: Dr. Yasin Ceran, "Data Manipulation with Pandas". 11:50 am - 12:00 pm: Q&A. About Dr. Yasin Ceran: Yasin Ceran is passionate about all things data and has extensive experience in data analysis, mathematical modeling, Apache Spark, SQL, Python, and R. He is currently an associate professor at KAIST, South Korea, and also teaches at San Jose State University in the heart of Silicon Valley. Yasin has worked on an array of data-related projects encompassing data mining, statistics, and modeling, and is dedicated to sharing his experience and expertise with learners. Webinar passcode: 953375 |
Data Manipulation with Pandas
|
|
Data Manipulation with Pandas
2024-08-10 · 17:00
Dr. Yasin Ceran
– Associate Professor
@ KAIST
Pandas basics, Series, and DataFrame structures, loading/saving data, data selection, and filtering. |
Data Manipulation with Pandas
|
|
Data Manipulation with Pandas
2024-08-10 · 17:00
Please register using the Zoom link to get a reminder: https://us02web.zoom.us/webinar/register/7817214087648/WN_wWxCYMFhSJmpWy-hTzC2xw Join us for a hands-on, two-week workshop to master data manipulation with Python's pandas library. This workshop is perfect for students, data enthusiasts, and professionals looking to enhance their data analysis skills. Basic knowledge of Python is recommended. Workshop Outline: Week 1: Introduction and Basic Data Manipulation. Session 1: pandas basics, Series and DataFrame structures, loading/saving data, data selection, and filtering. Session 2: Handling missing data, data transformation, managing duplicates, and combining DataFrames. Week 2: Advanced Techniques and Visualization. Session 3: Grouping and aggregation, pivot tables, cross-tabulation, and working with time series data. Session 4: Data visualization with pandas and other libraries. Agenda (PDT): 10:00 am - 10:05 am: Arrival, socializing, and opening. 10:05 am - 11:50 am: Dr. Yasin Ceran, "Data Manipulation with Pandas". 11:50 am - 12:00 pm: Q&A. About Dr. Yasin Ceran: Yasin Ceran is passionate about all things data and has extensive experience in data analysis, mathematical modeling, Apache Spark, SQL, Python, and R. He is currently an associate professor at KAIST, South Korea, and also teaches at San Jose State University in the heart of Silicon Valley. Yasin has worked on an array of data-related projects encompassing data mining, statistics, and modeling, and is dedicated to sharing his experience and expertise with learners. Webinar passcode: 953375 |
Data Manipulation with Pandas
|
|
Data Manipulation with Pandas
2023-12-15 · 18:30
Dr. Yasin Ceran
– Associate Professor
@ KAIST
Hands-on tutorial on data manipulation with pandas. |
Data Manipulation with Pandas
|
|
Data Manipulation with Pandas
2023-12-15 · 18:30
Please register using the Zoom link to get a reminder: https://us02web.zoom.us/webinar/register/1316984993505/WN_i227Ph51SHu9znRh6BAKyg This workshop will be a hands-on tutorial for the Python pandas library. pandas is a popular tool for manipulating, cleaning, integrating, and wrangling tabular data. Data scientists spend a significant amount of their time on such operations. This workshop aims to introduce how pandas can be used in data analysis by working on real datasets. The workshop will be held using Jupyter Notebook. One easy way to install it is through the Anaconda platform: https://www.anaconda.com/products/individual Agenda (PST): 10:25 am - 10:30 am: Arrival, socializing, and opening. 10:30 am - 12:20 pm: Dr. Yasin Ceran, "Data Manipulation with Pandas". 12:20 pm - 12:30 pm: Q&A. About Dr. Yasin Ceran: Yasin Ceran is passionate about all things data and has extensive experience in data analysis, mathematical modeling, Apache Spark, SQL, Python, and R. He is currently an associate professor at KAIST, South Korea, and also teaches at San Jose State University in the heart of Silicon Valley. Yasin has worked on an array of data-related projects encompassing data mining, statistics, and modeling, and is dedicated to sharing his experience and expertise with learners. Webinar passcode: 356741 |
Data Manipulation with Pandas
|
|
Data Manipulation with Pandas
2023-11-24 · 18:30
Please register using the Zoom link to get a reminder: https://us02web.zoom.us/webinar/register/1316984993505/WN_rmJ7nMIzQWK76evLIozZOg This workshop will be a hands-on tutorial for the Python pandas library. pandas is a popular tool for manipulating, cleaning, integrating, and wrangling tabular data. Data scientists spend a significant amount of their time on such operations. This workshop aims to introduce how pandas can be used in data analysis by working on real datasets. The workshop will be held using Jupyter Notebook. One easy way to install it is through the Anaconda platform: https://www.anaconda.com/products/individual Agenda (PST): 10:25 am - 10:30 am: Arrival, socializing, and opening. 10:30 am - 12:20 pm: Dr. Yasin Ceran, "Data Manipulation with Pandas". 12:20 pm - 12:30 pm: Q&A. About Dr. Yasin Ceran: Yasin Ceran is passionate about all things data and has extensive experience in data analysis, mathematical modeling, Apache Spark, SQL, Python, and R. He is currently an associate professor at KAIST, South Korea, and also teaches at San Jose State University in the heart of Silicon Valley. Yasin has worked on an array of data-related projects encompassing data mining, statistics, and modeling, and is dedicated to sharing his experience and expertise with learners. Webinar passcode: 356741 |
Data Manipulation with Pandas
|
|
Fabio Nelli
– author
Explore the latest Python tools and techniques to help you tackle the world of data acquisition and analysis. You'll review scientific computing with NumPy, visualization with matplotlib, and machine learning with scikit-learn. This third edition is fully updated for the latest version of Python and its related libraries, and includes coverage of social media data analysis, image analysis with OpenCV, and deep learning libraries. Each chapter includes multiple examples demonstrating how to work with each library. At its heart lies the coverage of pandas, for high-performance, easy-to-use data structures and tools for data manipulation. Author Fabio Nelli expertly demonstrates using Python for data processing, management, and information retrieval. Later chapters apply what you've learned to handwriting recognition and extending graphical capabilities with the JavaScript D3 library. Whether you are dealing with sales data, investment data, medical data, web page usage, or other data sets, Python Data Analytics, Third Edition is an invaluable reference with its examples of storing, accessing, and analyzing data. What You'll Learn: Understand the core concepts of data analysis and the Python ecosystem. Go in depth with pandas for reading, writing, and processing data. Use tools and techniques for data visualization and image analysis. Examine popular deep learning libraries Keras, Theano, TensorFlow, and PyTorch. Who This Book Is For: Experienced Python developers who need to learn about Pythonic tools for data analysis |
O'Reilly Data Science Books
|
|
Mastering Data Manipulation with Pandas
2023-08-15 · 18:50
Dr. Sefer Baday
– Assistant Professor
@ Informatics Institute of Istanbul Technical University
A hands-on tutorial for the Python pandas library covering data manipulation, cleaning, integration, and wrangling of tabular data. |
Mastering Data Manipulation with Pandas
|