talk-data.com

Event

DataTalks.Club

2020-11-21 – 2025-11-28 Podcasts

Activities tracked

104

DataTalks.Club - the place to talk about data!

Filtering by: GitHub

Sessions & talks

Showing 26–50 of 104 · Newest first

Working in Open Source - Probabl.ai and sklearn - Vincent Warmerdam

2024-05-03
podcast_episode

We talked about:

Vincent’s Background
SciKit Learn’s History and Company Formation
Maintaining and Transitioning Open Source Projects
Teaching and Learning Through Open Source
Role of Developer Relations and Content Creation
Teaching Through Calm Code and The Importance of Content Creation
Current Projects and Future Plans for Calm Code
Data Processing Tricks and The Importance of Innovation
Learning the Fundamentals and Changing the Way You See a Problem
Dev Rel and Core Dev in One
Why :probabl. Needs a Dev Rel
Exploration of Skrub and Advanced Data Processing
Personal Insights on SciKit Learn and Industry Trends
Vincent’s Upcoming Projects

Links:

probabl. YouTube channel: https://www.youtube.com/@UCIat2Cdg661wF5DQDWTQAmg
Calmcode website: https://calmcode.io/
probabl. website: https://probabl.ai/

Free Data Engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

AI for Ecology, Biodiversity, and Conservation - Tanya Berger-Wolf

2024-04-26
podcast_episode

Links:

Biodiversity and Artificial Intelligence pdf: https://www.gpai.ai/projects/responsible-ai/environment/biodiversity-and-AI-opportunities-recommendations-for-action.pdf

Free Data Engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Knowledge Graphs and LLMs Across Academia and Industry - Anahita Pakiman

2024-04-05
podcast_episode

We talked about:

Anahita's Background
Mechanical Engineering and Applied Mechanics
Finite Element Analysis vs. Machine Learning
Optimization and Semantic Reporting
Application of Knowledge Graphs in Research
Graphs vs Tabular Data
Computational graphs
Graph Data Science and Graph Machine Learning
Combining Knowledge Graphs and Large Language Models (LLMs)
Practical Applications and Projects
Challenges and Learnings
Anahita’s Recommendations

Links:

GitHub repo: https://github.com/antahiap/ADPT-LRN-PHYS/tree/main

Free Data Engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Inclusive Data Leadership Coaching - Tereza Iofciu

2024-03-29
podcast_episode

We talked about:

Tereza’s background
Switching from an Individual Contributor to Lead
Python Pizza and the pizza management metaphor
Learning to figure things out on your own and how to receive feedback
Tereza as a leadership coach
Podcasts
Tereza’s coaching framework (selling yourself vs bragging)
The importance of retrospectives
The importance of communication and active listening
Convincing people you don’t have power over
Building relationships and empathy
Inclusive leadership

Links:

LinkedIn: https://www.linkedin.com/in/tereza-iofciu/
Twitter: https://twitter.com/terezaif
Github: https://github.com/terezaif
Website: https://terezaiofciu.com

Free Data Engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Building Production Search Systems - Daniel Svonava

2024-03-22
podcast_episode

Links:

VectorHub: https://superlinked.com/vectorhub/?utm_source=community&utm_medium=podcast&utm_campaign=datatalks
Daniel's LinkedIn: https://www.linkedin.com/in/svonava/

Free Data Engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

This podcast is sponsored by VectorHub, a free open-source learning community for all things vector embeddings and information retrieval systems.

Building Machine Learning Products - Reem Mahmoud

2024-03-16
podcast_episode

We talked about:

Reem’s background
Context-aware sensing and transfer learning
Shifting focus from PhD to industry
Reem’s experience with startups and dealing with prejudices towards PhDs
AI interviewing solution
How candidates react to getting interviewed by an AI avatar
End-to-end overview of a machine learning project
The pitfalls of using LLMs in your process
Mitigating biases
Addressing specific requirements for specific roles
Reem’s resource recommendations

Links:

LinkedIn: https://www.linkedin.com/in/reemmahmoud/recent-activity/all/
Website: https://topmate.io/reem_mahmoud

Free Data Engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Make an Impact Through Volunteering Open Source Work - Sara EL-ATEIF

2024-02-23
podcast_episode
Sara EL-ATEIF (Google)

We talked about:

Sara’s background
On being a Google PhD fellow
Sara’s volunteer work
Finding AI volunteer work
Sara’s Fruit Punch challenge
How to take part in AI challenges
AI Wonder Girls
Hackathons
Things people often miss in AI projects and hackathons
Getting creative
Fostering your social media
Tips on applying for volunteer projects
Why it’s worth doing volunteer projects
Opportunities for data engineers and students
Sara’s newsletter suggestions

Links:

Dev and AI hackathons: https://devpost.com/
Healthcare-focused challenges: https://grand-challenge.org/challenges/
Volunteering in projects (AI4Good): https://www.fruitpunch.ai/
Volunteering in projects (AI4Good) 2: https://www.omdena.com/
Twitter: https://twitter.com/el_ateifSara
Instagram: https://www.instagram.com/saraelateif/
LinkedIn: https://www.linkedin.com/in/sara-el-ateif/
Youtube: www.youtube.com/@elateifsara

Free Data Engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Accelerating The Job Hunt for The Perfect Job in Tech - Sarah Mestiri

2024-02-02
podcast_episode
Sarah Mestiri (Thriving Career Moms)

We talked about:

Sarah’s background
How Sarah became a coach and found her niche
Sarah’s clients
How Sarah helps her clients find the perfect job
Finding a specialization
Informational interviews
Building a connection for mutual benefit
The networking strategy
Listing your projects in the CV
The importance of doing research yourself and establishing your interests
How to land a part-time job when the company wants full-time
Age is not a factor
Applying for jobs after finishing a course and the importance of sharing your learnings
Sarah’s resource recommendations

Links:

LinkedIn: https://www.linkedin.com/in/sarahmestiri/
Website: https://thrivingcareermoms.com/
Personal Website: https://www.sarahmestiri.com/
Youtube channel: https://www.youtube.com/@thrivingcareermoms444

Free Data Engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Machine Learning Engineering in Finance - Nemanja Radojkovic

2024-01-31
podcast_episode

We talked about:

Nemanja’s background
When Nemanja first worked as a data person
Typical problems that ML Ops folks solve in the financial sector
What Nemanja currently does as an ML Engineer
The obstacle of implementing new things in financial sector companies
Going through the hurdles of DevOps
Working with an on-premises cluster
“ML Ops on a Shoestring” (You don’t need fancy stuff to start w/ ML Ops)
Tactical solutions
Platform work and code work
Programming and soft skills needed to be an ML Engineer
The challenges of transitioning from electrical engineering and sales to ML Ops
The ML Ops tech stack for beginners
Working on projects to determine which skills you need

Links:

LinkedIn: https://www.linkedin.com/in/radojkovic/

Free Data Engineering course: https://github.com/DataTalksClub/data-engineering-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Navigating Challenges and Innovations in Search Technologies - Atita Arora

2023-12-27
podcast_episode

We talked about:

Atita’s background
How NLP relates to search
Atita’s experience with Lucidworks and OpenSource Connections
Atita’s experience with Qdrant and vector databases
Utilizing vector search
Major changes to search Atita has noticed throughout her career
RAG (Retrieval-Augmented Generation)
Building a chatbot out of transcripts with LLMs
Ingesting the data and evaluating the results
Keeping humans in the loop
Application of vector databases for machine learning
Collaborative filtering
Atita’s resource recommendations

Links:

LinkedIn: https://www.linkedin.com/in/atitaarora/
Twitter: https://x.com/atitaarora
Github: https://github.com/atarora
Human-in-the-Loop Machine Learning: https://www.manning.com/books/human-in-the-loop-machine-learning
Relevant Search: https://www.manning.com/books/relevant-search
Let's learn about Vectors: https://hub.superlinked.com/
Langchain: https://python.langchain.com/docs/get_started/introduction
Qdrant blog: https://blog.qdrant.tech/
OpenSource Connections Blog: https://opensourceconnections.com/blog/

Free ML Engineering course: http://mlzoomcamp.com Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

The Entrepreneurship Journey: From Freelancing to Starting a Company - Adrian Brudaru

2023-12-19
podcast_episode

We talked about:

Adrian’s background
The benefits of freelancing
Having an agency vs freelancing
What led Adrian to switch over from freelancing
The conception of DLT (Growth Full Stack)
The investment required to start a company
Growth through the provision of services
Growth through teaching (product-market fit)
Moving on to creating docs
Adrian’s current role
Strategic partnerships and community growth through DocDB
Plans for the future of DLT
DLT vs Airbyte vs Fivetran
Adrian’s resource recommendations

Links:

Adrian's LinkedIn: https://www.linkedin.com/in/data-team/
Twitter: https://twitter.com/dlt_library
Github: https://github.com/dlt-hub/dlt
Website: https://dlthub.com/docs/intro

Free ML Engineering course: http://mlzoomcamp.com Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Collaborative Data Science in Business - Ioannis Mesionis

2023-10-27
podcast_episode

Links:

LinkedIn: https://www.linkedin.com/in/ioannis-mesionis/
Github: https://github.com/ioannismesionis
Website: https://ioannismesionis.github.io/

Free ML Engineering course: http://mlzoomcamp.com Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Data Engineering for Fraud Prevention - Angela Ramirez

2023-10-06
podcast_episode
Angela Ramirez (Sam's Club)

We talked about:

Angela's background
Angela's role at Sam's Club
The usefulness of knowing ML as a data engineer
Angela's career path
Transitioning from data analyst to data engineer/system designer
Best practices for system design and data engineering
Working with document databases
Working with network-based databases
Detecting fraud with a network-based database
Selecting the database type to work with
Neo4j vs Postgres
The importance of having software engineering knowledge in data engineering
Data quality check tooling
The greatest challenges in data engineering
Debugging and finding the root cause of a failed job
What kinds of tools Angela uses on a daily basis
Working with external data sources
Angela's resource recommendations

Links:

LinkedIn: https://www.linkedin.com/in/aramirez1305/
Twitter: https://twitter.com/angelamaria__r
Github: https://github.com/aramir62
Previous podcast talk: https://twitter.com/i/spaces/1OwGWwZAZDnGQ?s=20

Free ML Engineering course: http://mlzoomcamp.com

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Pragmatic and Standardized MLOps - Maria Vechtomova

2023-09-08
podcast_episode
Maria Vechtomova (Marvelous MLOps)

We talked about:

Maria's background
Marvelous MLOps
Maria's definition of MLOps
Alternate team setups without a central MLOps team
Pragmatic vs non-pragmatic MLOps
Must-have ML tools (categories)
Maturity assessment
What to start with in MLOps
Standardized MLOps
Convincing DevOps to implement
Understanding what the tools are used for instead of knowing all the tools
Maria's next project plans
Is LLM Ops a thing?
What Ahold Delhaize does
Resource recommendations to learn more about MLOps
The importance of data engineering knowledge for ML engineers

Links:

LinkedIn: https://www.linkedin.com/company/marvelous-mlops/

Website: https://marvelousmlops.substack.com/

Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Democratizing Causality - Aleksander Molak

2023-08-25
podcast_episode

We talked about:

Aleksander's background
Aleksander as a Causal Ambassador
Using causality to make decisions
Counterfactuals and Judea Pearl
Meta-learners vs classical ML models
Average treatment effect
Reducing causal bias, the super efficient estimator, and model uplifting
Metrics for evaluating a causal model vs a traditional ML model
Is the added complexity of a causal model worth implementing?
Utilizing LLMs in causal models (text as outcome)
Text as treatment and style extraction
The viability of A/B tests in causal models
Graphical structures and nonparametric identification
Aleksander's resource recommendations

Links:

The Book of Why: https://amzn.to/3OZpvBk
Causal Inference and Discovery in Python: https://amzn.to/46Pperr
Book's GitHub repo: https://github.com/PacktPublishing/Causal-Inference-and-Discovery-in-Python
The Battle of Giants: Causality vs NLP (PyData Berlin 2023): https://www.youtube.com/watch?v=Bd1XtGZhnmw
New Frontiers in Causal NLP (papers repo): https://bit.ly/3N0TFTL

Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Mastering Data Engineering as a Remote Worker - José María Sánchez Salas

2023-08-18
podcast_episode

We talked about:

José's background
How José relocated to Norway and his schedule
Tech companies in Norway and José's role
Challenges of working as a remote data engineer
José's newsletter on how to make use of data
The process of making data useful
Where José gets inspiration for his newsletter
Dealing with burnout
When in Norway, do as the Norwegians do
The legalities of working remotely in Norway
The benefits of working remotely

Links:

LinkedIn: https://www.linkedin.com/in/jmssalas
Github: https://github.com/jmssalas
Website & Newsletter: https://jmssalas.com

Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

The Good, the Bad and the Ugly of GPT - Sandra Kublik

2023-08-04
podcast_episode

We talked about:

Sandra's background
Making a YouTube channel to break into the LLM space
The business cases for LLMs
LLMs as amplifiers
The benefits of keeping a human in the loop when using LLMs (AI limitations)
Using LLMs as assistants
Building an app that uses an LLM
Prompt whisperers and how to improve your prompts
Sandra's 7-day LLM experiment
Sandra's LLM content recommendations
Finding Sandra online

Links:

LinkedIn: https://www.linkedin.com/in/sandrakublik/
Twitter: https://twitter.com/sandra_kublik
Youtube: https://www.youtube.com/@sandra_kublik

Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

LLMs for Everyone - Meryem Arik

2023-07-28
podcast_episode
Meryem Arik (TitanML)

We talked about:

Meryem's background
The constant evolution of startups
How Meryem became interested in LLMs
What is an LLM (generative vs non-generative models)?
Why LLMs are important
Open source models vs API models
What TitanML does
How fine-tuning a model helps in LLM use cases
Fine-tuning generative models
How generative models change the landscape of human work
How to adjust models over time
Vector databases and LLMs
How to choose an open source LLM or an API
Measuring input data quality
Meryem's resource recommendations

Links:

Website: https://www.titanml.co/
Beta docs: https://titanml.gitbook.io/iris-documentation/overview/guide-to-titanml...
Using llama2.0 in TitanML Blog: https://medium.com/@TitanML/the-easiest-way-to-fine-tune-and-inference-llama-2-0-8d8900a57d57
Discord: https://discord.gg/83RmHTjZgf
Meryem LinkedIn: https://www.linkedin.com/in/meryemarik/

Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Investing in Open-Source Data Tools - Bela Wiertz

2023-07-21
podcast_episode
Bela Wiertz (TKM Family Office)

We talked about:

Bela's background
Why startups even need investors
Why open source is a viable go-to-market strategy
Building a bottom-up community
The investment thesis for the TKM Family Office and the blurriness of the funding round naming convention
Angel investors vs VC Funds vs family offices
Bela's investment criteria and GitHub stars as a metric
Inbound sourcing, outbound sourcing, and investor networking
Making a good impression on an investor
Balancing open and closed source parts of a product
The future of open source
Recent successes of open source companies
Bela's resource recommendations

Links:

Understand who is engaging with your open source project article: https://www.crowd.dev/
Top 6 Books on Developer Community Building: https://www.crowd.dev/post/top-6-books-on-developer-community-building
Which open source software metrics matter: https://www.bvp.com/atlas/measuring-the-engagement-of-an-open-source-software-community#Which-open-source-software-metrics-matter

Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Why Machine Learning Design is Broken - Valerii Babushkin

2023-07-14
podcast_episode

Links:

Book: https://www.manning.com/books/machine-learning-system-design?utm_source=AGMLBookcamp&utm_medium=affiliate&utm_campaign=book_babushkin_machine_4_25_23&utm_content=twitter
Discount: poddatatalks21 (35% off)
Evidently: https://www.evidentlyai.com/
Article: https://medium.com/people-ai-engineering/design-documents-for-ml-models-bbcd30402ff7

Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

From Scratch to Success: Building an MLOps Team and ML Platform - Simon Stiebellehner

2023-06-30
podcast_episode

We talked about:

Simon's background
What MLOps is and what it isn't
Skills needed to build an ML platform that serves 100s of models
Ranking the importance of skills
The point where you should think about building an ML platform
The importance of processes in ML platforms
Weighing your options with SaaS platforms
The exploratory setup, experiment tracking, and model registry
What comes after deployment?
Stitching tools together to create an ML platform
Keeping data governance in mind when building a platform
What comes first – the model or the platform?
Do MLOps engineers need to have deep knowledge of how models work?
Is API design important for MLOps?
Simon's recommendations for furthering MLOps knowledge

Links:

LinkedIn: https://www.linkedin.com/in/simonstiebellehner/
Github: https://github.com/stiebels
Medium: https://medium.com/@sistel

Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

From MLOps to DataOps - Santona Tuli

2023-06-23
podcast_episode

We talked about:

Santona's background
Focusing on data workflows
Upsolver vs DBT
ML pipelines vs Data pipelines
MLOps vs DataOps
Tools used for data pipelines and ML pipelines
The “modern data stack” and today's data ecosystem
Staging the data and the concept of a “lakehouse”
Transforming the data after staging
What happens after the modeling phase
Human-centric vs Machine-centric pipeline
Applying skills learned in academia to ML engineering
Crafting user personas based on real stories
A framework of curiosity
Santona's book and resource recommendations

Links:

LinkedIn: https://www.linkedin.com/in/santona-tuli/
Upsolver website: upsolver.com
Why we built a SQL-based solution to unify batch and stream workflows: https://www.upsolver.com/blog/why-we-built-a-sql-based-solution-to-unify-batch-and-stream-workflows

Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Data Developer Relations - Hugo Bowne-Anderson

2023-06-16
podcast_episode
Hugo Bowne-Anderson (DataCamp)

We talked about:

Hugo's background
Why do tools and the companies that run them have wildly different names?
Hugo's other projects besides Metaflow
Transitioning from educator to DevRel
What is DevRel?
DevRel vs Marketing
How DevRel coordinates with developers
How DevRel coordinates with marketers
What skills a DevRel needs
The challenges that come with being an educator
Becoming a good writer: nature vs nurture
Hugo's approach to writing and suggestions
Establishing a goal for your content
Choosing a form of media for your content
Is DevRel intercompany or intracompany?
The Vanishing Gradients podcast
Finding Hugo online

Links:

Hugo Bowne-Anderson's GitHub: http://hugobowne.github.io/
Vanishing Gradients: https://vanishinggradients.fireside.fm/
MLOps and DevOps: Why Data Makes It Different: https://www.oreilly.com/radar/mlops-and-devops-why-data-makes-it-different/
Evaluate Metaflow for free, right from your Browser: https://outerbounds.com/sandbox/

Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html

Lessons Learned from Freelancing and Working in a Start-up - Antonis Stellas

2023-06-09
podcast_episode

We talked about:

Antonis' background
The pros and cons of working for a startup
Useful skills for working at a startup and the Lean way to work
How Antonis joined the DataTalks.Club community
Suggestions for students joining the MLOps course
Antonis contributing to Evidently AI
How Antonis started freelancing
Getting your first clients on Upwork
Pricing your work as a freelancer
The process after getting approved by a client
Wearing many hats as a freelancer and while working at a startup
Other suggestions for getting clients as a freelancer
Antonis' thoughts on the Data Engineering course
Antonis' resource recommendations

Links:

Lean Startup by Eric Ries: https://theleanstartup.com/
Lean Analytics: https://leananalyticsbook.com/
Designing Machine Learning Systems by Chip Huyen: https://www.oreilly.com/library/view/designing-machine-learning/9781098107956/
Kafka Streaming with Python by Kris Jenkins (tutorial video): https://youtu.be/jItIQ-UvFI4

Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html

Data Access Management - Bart Vandekerckhove

2023-06-02
podcast_episode

We talked about:

Bart's background
What is data governance?
Data dictionaries and data lineage
Data access management
How to learn about data governance
What skills are needed to do data governance effectively
When an organization needs to start thinking about data governance
Good data access management processes
Data masking and the importance of automating data access
DPO and CISO roles
How data access management works with a data mesh approach
Avoiding the role explosion problem
The importance of data governance integration in DataOps
Terraform as a stepping stone to data governance
How Raito can help an organization with data governance
Open-source data governance tools

Links:

LinkedIn: https://www.linkedin.com/in/bartvandekerckhove/
Twitter: https://twitter.com/Bart_H_VDK
Github: https://github.com/raito-io
Website: https://www.raito.io/
Data Mesh Learning Slack: https://data-mesh-learning.slack.com/join/shared_invite/zt-1qs976pm9-ci7lU8CTmc4QD5y4uKYtAA#/shared-invite/email
DataQG Website: https://dataqg.com/
DataQG Slack: https://dataqgcommunitygroup.slack.com/join/shared_invite/zt-12n0333gg-iTZAjbOBeUyAwWr8I~2qfg#/shared-invite/email
DMBOK (Data Management Body of Knowledge): https://www.dama.org/cpages/body-of-knowledge
DMBOK Wheel describing the data governance activities: https://www.dama.org/cpages/dmbok-2-wheel-images

Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp

Join DataTalks.Club: https://datatalks.club/slack.html

Our events: https://datatalks.club/events.html