talk-data.com
Activities & events
| Title & Speakers | Event |
|---|---|
|
Why is your Python code slow? Recommendations for improving performance
2025-12-02 · 20:10
It's very likely that throughout your journey with Python, you've heard people say that Python is slow. While there is a gap between interpreted and compiled languages that favors compiled languages, Python has ways to improve the performance of your programs, but these aren't widely known among coders. In this talk, we'll explore some tools and programming patterns that will help you improve the performance of your programs, thereby improving the speed of your applications, tests, and products. After the presentation, you'll have a list of techniques you can apply to your code, as well as the necessary steps to continue exploring code optimization. No prior knowledge of code profilers or advanced techniques is required to attend this talk. |
|
|
Moving beyond Slop Coding
2025-12-02 · 19:40
Matt Harrison
– Python expert
AI can type faster than you. However, it has been trained on lots of naive or poor code (and a little decent code). Let's explore how you can take advantage of software engineering (and Python) best practices to help tame the bias of the AIs. |
|
|
How I Learned to Stop Worrying and Love Generators
2025-12-02 · 19:10
Paweł Wiszniewski
– Senior Data Engineer
@ Flink SE
Python's generators offer a simple, elegant way to build lightweight data pipelines. In this talk, we’ll break down generator functions and expressions and walk through practical Data Engineering examples: streaming large datasets in chunks, transforming records without exhausting memory, and using yield for clean setup and teardown. A concise tour of how generators can make data workflows more efficient—and more elegant. |
|
|
LIVE TRAINING "Machine Learning with XGBoost"
2024-07-18 · 16:00
This is a paid event. You may access the training with a paid subscription to the Ai+ Training platform or purchase access to this specific session. More details are here: https://hubs.li/Q02F61jN0 This workshop will show how to use XGBoost. It will demonstrate model creation, model tuning, model evaluation, and model interpretation. The XGBoost library is one of the most popular libraries among data scientists for creating predictive models with structured (or tabular) data. This workshop will cover the library, tuning it, evaluating models created by it, and understanding predictions from it. Attendees will have the chance to try it out with the labs. Instructor's bio: Matt Harrison, Python & Data Science Corporate Trainer \| Consultant \| MetaSnake. Matt Harrison has been using Python since 2000. He runs MetaSnake, a Python and Data Science consultancy and corporate training shop. In the past, he has worked across the domains of search, build management and testing, business intelligence, and storage. He has presented and taught tutorials at conferences such as Strata, SciPy, SCALE, PyCon, and OSCON as well as local user conferences. ODSC Links: • Get free access to more talks/trainings like this at the Ai+ Training platform: https://hubs.li/H0Zycsf0 • ODSC blog: https://opendatascience.com/ • Facebook: https://www.facebook.com/OPENDATASCI • Twitter: https://twitter.com/_ODSC & @odsc • LinkedIn: https://www.linkedin.com/company/open-data-science • Slack Channel: https://hubs.li/Q02zdcSk0 • Code of conduct: https://odsc.com/code-of-conduct/ |
LIVE TRAINING "Machine Learning with XGBoost"
|
|
Matt Harrison - Self Publishing Technical Books, Working With Publishers, Book Piracy, and More
2023-12-06 · 16:17
Matt Harrison
– Python expert
,
Joe Reis
– founder
@ Ternary Data
Matt Harrison is the author of many of the most successful Python books, including Effective Pandas, Effective XGBoost, The Machine Learning Pocket Reference, and many more. I consider him the top author of Python books and content on the planet. He stopped by my house to chat about self-publishing technical books, the pros and cons of using a publisher, book piracy, and much more. We both talk about our experiences as best-selling technical authors, and don't hold back in this wide-ranging and very candid conversation. Enjoy! Note: my audio got a bit clippy in spots. Sorry if I blew up your speaker. |
The Joe Reis Show |
|
Effective Pandas Patterns For Data Engineering
2022-01-31 · 01:00
Matt Harrison
– Python expert
,
Tobias Macey
– host
Summary: Pandas is a powerful tool for cleaning, transforming, manipulating, or enriching data, among many other potential uses. As a result it has become a standard tool for data engineers for a wide range of applications. Matt Harrison is a Python expert with a long history of working with data who now spends his time on consulting and training. He recently wrote a book on effective patterns for Pandas code, and in this episode he shares advice on how to write efficient data processing routines that will scale with your data volumes, while being understandable and maintainable. Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Today’s episode is sponsored by Prophecy.io, the low-code data engineering platform for the cloud. Prophecy provides an easy-to-use visual interface to design & deploy data pipelines on Apache Spark & Apache Airflow. Now all the data users can use software engineering best practices (git, tests, and continuous deployment) with a simple-to-use visual designer. How does it work? You visually design the pipelines, and Prophecy generates clean Spark code with tests on git; then you visually schedule these pipelines on Airflow. You can observe your pipelines with built-in metadata search and column-level lineage.
Finally, if you have existing workflows in AbInitio, Informatica or other ETL formats that you want to move to the cloud, you can import them automatically into Prophecy, making them run productively on Spark. Create your free account today at dataengineeringpodcast.com/prophecy. The only thing worse than having bad data is not knowing that you have it. With Bigeye’s data observability platform, if there is an issue with your data or data pipelines you’ll know right away and can get it fixed before the business is impacted. Bigeye lets data teams measure, improve, and communicate the quality of your data to company stakeholders. With complete API access, a user-friendly interface, and automated yet flexible alerting, you’ve got everything you need to establish and maintain trust in your data. Go to dataengineeringpodcast.com/bigeye today to sign up and start trusting your analyses. Your host is Tobias Macey and today I’m interviewing Matt Harrison about useful tips for using Pandas for data engineering projects. Interview: Introduction. How did you get involved in the area of data management? What are the main tasks that you have seen Pandas used for in a data engineering context? What are some of the common mistakes that can lead to poor performance when scaling to large data sets? What are some of the utility features that you have found most helpful for data processing? One of the interesting add-ons to Pandas is its integration with Arrow. What are some of the considerations for how and when to use the Arrow capabilities vs. out-of-the-box Pandas? Pandas is a tool that spans data processing and data science. What are some of the ways that data engineers should think about writing their code to make it accessible to data scientists for supporting collaboration across data workflows? Pandas is often used for transformation logic. What are some of the ways that engineers should approach the design of their code to make it understandable and maint |
Data Engineering Podcast |