talk-data.com

Marysia Winkels

Speaker

2 talks

guest PyData

Talks & appearances

Help! There Are Humans in My Data!

Good-quality data is the basis for high-quality models and valuable data insights. But isn't it annoying how often your data is riddled with those pesky humans? Human involvement in data creation often introduces errors, misunderstandings, and biases that can compromise data integrity. This talk explores how human factors influence the data creation process and what we, as data professionals, can do to account for them when interpreting and using data.

We talked about:

Marysia’s background
What data-centric AI is
Data-centric Kaggle competitions
The mindset shift to data-centric AI
Data-centric does not mean you should not iterate on models
How to implement the data-centric approach
Focusing on the data vs focusing on the model
Resources to help implement the data-centric approach
Data-centric AI vs standard data cleaning
Making sure your data is representative
Knowing when your data is good enough
The importance of user feedback
“Shadow Mode” deployment
What to do if you have a lot of bad data or incomplete data
Marysia’s role at PyData
How Marysia joined PyData
The difference between PyData and PyCon
Finding Marysia online

Links:

Embetter & Bulk Demo: https://www.youtube.com/watch?v=L---nvDw9KU

Free data engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html