talk-data.com

Showing 5 results

Activities & events

Title & Speakers Event
Thijs Nieuwdorp – guest, Jeroen Janssens – guest, Joe Reis – founder @ Ternary Data

Jeroen Janssens and Thijs Nieuwdorp join me to chat about all things Polars. We discuss the evolution of the Polars library, its advantages over pandas, and their journey of writing 'Python Polars: The Definitive Guide.'

Pandas Polars Python
The Joe Reis Show
Jeroen Janssens, Thijs Nieuwdorp – guest @ VodafoneZiggo

Learn how to transform your Python code into a command-line tool. Jeroen Janssens, author of Data Science at the Command Line, guides you through the process of turning your scripts into reusable, executable tools, integrating them into your data workflows and harnessing the power of the Unix command line.
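As a rough illustration of the idea, and not the code from the talk itself, here is a minimal sketch of a Python function wrapped as a Unix-style filter. The tool (a word counter), its file name `wordcount.py`, and the `--top` option are all hypothetical; only the standard library (`argparse`, `collections`, `sys`) is used.

```python
"""Hypothetical example tool: print the most common words in a text stream."""
import argparse
import sys
from collections import Counter


def count_words(lines):
    """Return a Counter of lowercase, whitespace-separated words."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def build_parser():
    """Describe the command-line interface (names chosen for illustration)."""
    parser = argparse.ArgumentParser(
        description="Print the most common words in a file or standard input."
    )
    parser.add_argument(
        "path", nargs="?", default="-",
        help="input file, or - for standard input (the default)",
    )
    parser.add_argument(
        "-n", "--top", type=int, default=10,
        help="number of words to print (default: 10)",
    )
    return parser


def main(argv=None):
    """Parse arguments, count words, and write tab-separated output."""
    args = build_parser().parse_args(argv)
    if args.path == "-":
        # Reading standard input lets the tool sit in the middle of a pipeline.
        counts = count_words(sys.stdin)
    else:
        with open(args.path, encoding="utf-8") as handle:
            counts = count_words(handle)
    for word, count in counts.most_common(args.top):
        print(f"{word}\t{count}")
    return 0


# When saved as a script, end the file with the usual entry point:
#     if __name__ == "__main__":
#         raise SystemExit(main())
```

With that entry point in place, the script could be used in a pipeline, for example `cat notes.txt | python wordcount.py -n 5`. Keeping the logic in `count_words` and the interface in `main` means the same code stays importable from other Python programs.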

Data Science Python Unix
PyData London 2025

External registration required at nyhackr.org.

Note the 7 PM start time.

For the meetup right before the 10th Anniversary New York R Conference we have another long-time meetup member and repeat NYR speaker Jeroen Janssens talking about Polars and plotting.

After the talk we will randomly select two attendees (both in-person and virtual) to receive free tickets to The New York R Conference taking place May 16-17.

Thank you to NYU for hosting us.

Everybody attending must RSVP through the registration form at nyhackr.org. In-person tickets carry a charge; virtual tickets are free. Space is extremely limited, and in-person registration closes at 2 PM on the day of the talk.

About the Talk: Sure, Polars is fast, blazingly fast even. But how do you turn those dull DataFrames into something insightful? Fortunately, Python provides a plethora of packages, each with its own set of features, assumptions, and pitfalls.

In this talk, I'll walk you through a couple of packages to create pretty pictures. This includes hvPlot for quick plotting, Bokeh for interactive visualizations, and Plotnine for leveraging the grammar of graphics in Python. In addition, I'll demonstrate the Great Tables package for creating, well, great tables.

By the end you'll have a good idea of what each package has to offer, when to use which, and how to use them in combination with Polars.

About Jeroen: Jeroen Janssens, PhD, is a data science consultant and certified instructor. His expertise lies in visualizing data, implementing machine learning models, and building solutions using Python, R, JavaScript, and Bash. He’s passionate about helping and teaching others to do the same. Jeroen works as a Senior Machine Learning Engineer at Xomnia in Amsterdam. Before that, he spent six years running his own company, Data Science Workshops, a training and coaching firm that organized open enrollment workshops, in-company courses, hackathons, and meetups. Clients included Amazon, eHealth Africa, Schiphol Airport, The New York Times, and T-Mobile.

Previously, he was an assistant professor at Jheronimus Academy of Data Science and a data scientist at Elsevier in Amsterdam and several startups in New York City. He is the author of Data Science at the Command Line (O’Reilly Media, 2021). Jeroen holds a PhD in machine learning from Tilburg University and an MSc in artificial intelligence from Maastricht University.

He lives with his wife and two kids in Rotterdam, the Netherlands.

The venue doors open at 6:30 PM America/New_York, and we will continue enjoying pizza together (we encourage the virtual audience to have pizza as well). The talk and livestream begin at 7:00 PM America/New_York.

Remember, register at nyhackr.org.

Turning Polars DataFrames into Pretty Pictures and Great Tables

This thoroughly revised guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You'll learn how to combine small yet powerful command-line tools to quickly obtain, scrub, explore, and model your data. To get you started, author Jeroen Janssens provides a Docker image packed with over 100 Unix power tools, useful whether you work with Windows, macOS, or Linux. You'll quickly discover why the command line is an agile, scalable, and extensible technology. Even if you're comfortable processing data with Python or R, you'll learn how to greatly improve your data science workflow by leveraging the command line's power. This book is ideal for data scientists, analysts, engineers, system administrators, and researchers.

Obtain data from websites, APIs, databases, and spreadsheets
Perform scrub operations on text, CSV, HTML, XML, and JSON files
Explore data, compute descriptive statistics, and create visualizations
Manage your data science workflow
Create your own tools from one-liners and existing Python or R code
Parallelize and distribute data-intensive pipelines
Model data with dimensionality reduction, regression, and classification algorithms
Leverage the command line from Python, Jupyter, R, RStudio, and Apache Spark

data data-science Agile/Scrum API CSV Data Science Docker HTML JSON Linux Python Spark Unix XML

This hands-on guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You’ll learn how to combine small, yet powerful, command-line tools to quickly obtain, scrub, explore, and model your data. To get you started, whether you’re on Windows, OS X, or Linux, author Jeroen Janssens introduces the Data Science Toolbox, an easy-to-install virtual environment packed with over 80 command-line tools. Discover why the command line is an agile, scalable, and extensible technology. Even if you’re already comfortable processing data with, say, Python or R, you’ll greatly improve your data science workflow by also leveraging the power of the command line.

Obtain data from websites, APIs, databases, and spreadsheets
Perform scrub operations on plain text, CSV, HTML/XML, and JSON
Explore data, compute descriptive statistics, and create visualizations
Manage your data science workflow using Drake
Create reusable tools from one-liners and existing Python or R code
Parallelize and distribute data-intensive pipelines using GNU Parallel
Model data with dimensionality reduction, clustering, regression, and classification algorithms

data data-science Agile/Scrum API CSV Data Science HTML JSON Linux Python XML