talk-data.com


Activities & events


During this session, we’ll take a practical look at the Analytics Engineer role: what it actually covers, how it fits into modern data teams, and which skills matter most.

Rather than a step-by-step tutorial, this talk focuses on core concepts, real examples, and recurring patterns that define the work of Analytics Engineers today.

We’ll cover:

  • What the Analytics Engineer role really includes and how it complements data engineers, data scientists, and business teams
  • Technical foundations that keep an AE effective: SQL, data modeling, testing, version control, and development workflows
  • Soft skills that quietly shape impact: cross-team communication, stakeholder alignment, and systematic thinking

The talk draws on key insights from Fundamentals of Analytics Engineering, with Juan sharing lessons learned while writing the book and working with data teams.

By the end, you’ll have a grounded view of why the Analytics Engineer role exists, how it has evolved, and which capabilities are worth prioritizing if you want to advance in this career.

About the speaker: Juan Manuel Perafan is an analytics engineer, educator, and community builder based in Utrecht. He’s the co-author of Fundamentals of Analytics Engineering, host of the SQL Lingua Franca podcast, and a dbt Community Award winner. Juan founded the Analytics Engineering Meetup Netherlands and the Dutch dbt Meetup, and has spoken at events like dbt Coalesce, Linux Foundation OS Summit, and Big Data Expo NL.

**Join our Slack: https://datatalks.club/slack.html**

Foundations of Analytics Engineer Role: Skills, Scope, and Modern Practices

dbt Meetups are networking events open to all folks working with data! Talks predominantly focus on community members' experience with dbt; however, you'll also catch presentations on broader topics such as analytics engineering, data stacks, data ops, modeling, testing, and team structures.

🏠Venue Host: Create 26; 1061 Budapest, Király utca 26.
🍕Catering: snacks and beers
🤝Organizer: Hiflylabs

To attend, please read the Health and Safety Policy and Terms of Participation: https://www.getdbt.com/legal/health-and-safety-policy

📝Agenda

  • 6:30 - 7:00 | Registration (30 min)
  • 7:00 - 7:05 | Welcome Remarks (5 min)
  • 7:05 - 7:25 | Talk 1 (15-20 min)
  • 7:25 - 7:30 | Q&A (5 min)
  • 7:30 - 7:50 | Talk 2 (15-20 min)
  • 7:50 - 7:55 | Q&A (5 min)
  • 7:55 - 8:00 | Closing Remarks; dbt slides (5 min)
  • 8:00 - 9:00 | Networking - let's have a chat, beer and snacks!

🗣️Presentation #1: Switching the query engine under a dbt project - Lessons from the migration trenches

Description: Transitioning a dbt project isn't as easy as flipping a switch, but it can be done smoothly under the right conditions. Our first speaker will share the adventures of moving dbt projects across platforms like Redshift, Snowflake, and Databricks. You'll discover key principles for future migrations and gain practical advice on how to organize your dbt project to stay flexible for whatever comes next.

Speaker bio: Laci Pataki - Senior Data and Analytics Engineer

(Talk in English 🇺🇸)

---

🗣️Presentation #2: How to supercharge your dbt Testing - Best Practices & Pitfalls

Description: Testing is essential for gaining user trust, yet it has traditionally been limited (or fully absent) in SQL workflows. Fortunately, dbt provides a robust set of tools to test dbt projects. In this talk, we will cover essential tips and tricks for testing your dbt project, explore various types of tests, and discuss common pitfalls to avoid.

Speaker bio: Juan Perafan – Analytics Engineer consultant at Xebia, co-author of the book "Fundamentals of Analytics Engineering", host of the "SQL Lingua Franca" podcast series

(Talk in English 🇺🇸)

EVENT DETAILS: The doors open at 6:30pm. Presentations begin at 7:00pm. Food and refreshments will be provided. Our venue has capacity limits, so please only RSVP if you intend to come. If you need to cancel at the last minute, reach out to [email protected] or change your RSVP status on the Meetup to "Not Going."

➡️ Join the dbt Slack community: https://www.getdbt.com/community/ 🤝For the best Meetup experience, make sure to join the #local-budapest channel in dbt Slack (https://slack.getdbt.com/).

----------------------------------

dbt is the standard in data transformation, used by over 40,000 organizations worldwide. Through the application of software engineering best practices like modularity, version control, testing, and documentation, dbt’s analytics engineering workflow helps teams work more efficiently to produce data the entire organization can trust.

Learn more: https://www.getdbt.com/

Budapest dbt Meetup (in-person)
Juan Manuel Perafan – Analytics Engineer @ Xebia

Performance is a crucial factor in delivering timely and accurate data to organizations. However, debugging the performance of dbt models can be a challenge, as most available resources focus on legacy databases or on tips for specific data engines that do not translate to modern data platforms.

In this talk, Juan Manuel Perafan focuses on optimizing performance for dbt users without tying the advice to any specific data warehouse. He explores the commonalities across most data warehouses and provides practical tips and strategies for improving the performance of dbt models, from query optimization to materialization strategies.

Whether you're new to dbt or a seasoned user, this talk provides valuable insights and best practices for improving the performance of your dbt models.

Speaker: Juan Manuel Perafan, Analytics Engineer, Xebia

Register for Coalesce at https://coalesce.getdbt.com

Analytics dbt DWH
dbt Coalesce 2023

Please note: Registration is mandatory to attend the event. Secure your spot by registering on the GoDataFest website.

The GoDataFest Lite 2023 Summer Edition is around the corner. This full-day event is an opportunity to learn about the latest trends in data, AI/ML, and cloud technology. We're hosting it on-site in Amsterdam on July 5. The event is FREE to attend - all you need to do is register here. Seating is limited!

Engage with product specialists and experienced practitioners of leading tech. Learn from the experiences of industry-leading enterprises.

🗓️ Wednesday, July 5, 2023
📌 Xebia Data Office - Amsterdam, Wibautstraat 202, 1091 GS

Schedule:

9:30: Doors open

10:00-12:25: Kedro Breakfast
Jordi Smit and Caio Benatti Moretti would like to invite you to a Kedro Breakfast session. Here, you'll learn how to manage data pipelines locally and on the cloud, all with the same code. They'll provide guidance on segregating IO artifacts from your code to create a more stable setup, as well as how to review the current status of each pipeline node for efficient debugging and testing.

12:30-13:30: Lunch

13:30-14:10: Scaling Real-Time Streaming Analytics with Apache Flink: Unlocking Business Value and Future-Proofing Enterprises
Krzysztof Zarzycki will explore the transformative capabilities of Real-Time Streaming Analytics and the crucial role Apache Flink plays in powering such processes.

14:10-14:55: The multidisciplinary art of creating great KPIs
In this workshop, Juan Manuel Perafan will break down all the knowledge required to create robust KPIs, from how a KPI ties to your business goals and the perverse incentives it can create to its technical implementation. Everybody is welcome! No coding required.

15:00-15:40: Data Modelling for Dummies in DBT
Lasse Benninga will provide a fundamental understanding of data modeling for analytics, using dbt as an example. By the conclusion, you should have a comprehensive grasp of dbt's function and how analytics engineers worldwide use it to structure their data.

16:00-16:40: Our journey from 0 to MLOps Platform in Snowflake
Marcin Zabłocki will show you how we’ve built a usable MLOps platform on Snowflake by combining the powers of Snowpark, MLflow & Kedro to make ML part of the data platform.

16:45-17:20: How to trade repeatability of bad habits for reusability of best practices. A few facts about open-source QuickStart ML Blueprints
Piotr Chaberski will dive deep into QuickStart ML Blueprints, an open-source collection of examples for efficient prototyping of production-ready ML solutions. It embodies best development practices in a concrete form, ready to apply to new challenges.

17:25-18:05: The Role of Machine Learning System Design in Ensuring MLOps Success
During this talk, Roy van Santen will discuss why Machine Learning System Design is important and what it entails, and will provide you with a framework for approaching a design. Machine Learning System Design takes a holistic view of ML applications.

18:15-19:00: Journey into Gen AI
Rens Dimmendaal will address the challenges encountered with ChatGPT and how we tackled them by creating an internal tool, SlackGPT.

19:00: Pizza & networking

Register now! Seats are limited.

We look forward to seeing you there!

GoDataFest Lite 2023 Summer Edition

The 2nd edition of the Belgium dbt Meetup will take place in Ghent on June 14th.

dbt Meetups are networking events open to all folks working with data! Talks predominantly focus on community members' experience with dbt; however, you'll also catch presentations on broader topics such as analytics engineering, data stacks, data ops, modeling, testing, and team structures.

🏠Venue Host: OTA Insight, Gaston Crommenlaan 6, Gent
🤝Organizers: Lise Kerckhove & Sam Debruyn

To attend, please read the Required Participation Language for In-Person Events with dbt Labs: https://bit.ly/3QIJXFb

📝Agenda:

  • 17h45: welcome with food & drinks
  • 18h30: Data Mesh with dbt (Charles Verleyen & Lukasz Sciga - Astrafy)
  • 19h15: Boosting Collaboration and Productivity with DevContainers and dbt Core (Mateusz Marciniewicz - dataroots)
  • 19h45: break
  • 20h: Documenting KPIs: Beyond dbt Semantic Layer (Juan Manuel Perafan - GoDataDriven/Xebia)
  • 20h45: networking & drinks

Data Mesh with dbt (Charles Verleyen & Lukasz Sciga - Astrafy)

Data Mesh has become a powerful buzzword, and every data practitioner has heard about it. The major problem with buzzwords is that they attract many interpretations, which can generate quite some frustration. In the case of data mesh, it can also slow down the adoption of this amazing new paradigm.

In this session I will demystify the main concepts of data mesh and bring them to life with dbt, using concrete data products built on real Astrafy data. You can expect to see data mesh in action in a very pragmatic way, as we will also cover how downstream applications activate data from those data products (leveraging the dbt Semantic Layer).

Charles is a 34-year-old data engineer who worked in different industries for 10 years before recently founding his own company, which focuses on providing modern data services. He is passionate about open-source technologies and strives to deliver automated data solutions that can scale. Aside from his work, he's a sports lover with a special dedication to tennis.

Boosting Collaboration and Productivity with DevContainers and dbt Core (Mateusz Marciniewicz - dataroots)

In this presentation we will discuss the benefits of using DevContainers together with dbt Core for data modeling and transformation. DevContainers provide a consistent, isolated environment for development, ensuring that all developers have the same set of tools and dependencies installed. This makes it easier to collaborate and reduces the risk of conflicts or errors. By adding dbt Core to the DevContainer, developers can take advantage of its powerful features and leverage the capabilities of the VS Code IDE. This can greatly simplify both working with data and onboarding new employees, allowing developers to focus on creating value rather than managing infrastructure.

Mateusz is Polish-born with a background in Computer Science. He came to Belgium to pursue his studies, and he decided to stay and build his career here. In his free time, he enjoys sports and camping.

Documenting KPIs: Beyond dbt Semantic Layer (Juan Manuel Perafan - GoDataDriven/Xebia)

The dbt Semantic Layer has made a lot of teams reflect on the way they document their KPIs. But why stop there? In this talk, I will break down all the knowledge required to understand a metric, from how it ties to your business goals and the perverse incentives it can create to its technical implementation.

Juan's mission is to help data professionals create reliable, production-grade data workflows that are easy to maintain, properly tested, and impeccably documented. Half of the time, he teaches analysts how to apply engineering best practices to their workflows. The other half, he helps engineers put themselves in the analysts' shoes.

---------------

➡️ Join the dbt Slack community: https://www.getdbt.com/community/ 🤝 For the best Meetup experience, make sure to join the #local-belgium channel in dbt Slack (https://slack.getdbt.com/)!

---------------

dbt is a data transformation framework that lets analysts and engineers collaborate using their shared knowledge of SQL. Through the application of software engineering best practices like modularity, version control, testing, and documentation, dbt’s analytics engineering workflow helps teams work more efficiently to produce data the entire organization can trust.

Learn more: https://www.getdbt.com/

Belgium dbt Meetup #2 (in-person)
Bram Ochsendorf – Consultant @ GoDataDriven, Guillermo Sanchez – Consultant @ GoDataDriven, Juan Perafan – Consultant @ GoDataDriven, Tobias Macey – host

Summary

We have been building platforms and workflows to store, process, and analyze data since the earliest days of computing. Over that time there have been countless architectures, patterns, and "best practices" to make that task manageable. With the growing popularity of cloud services a new pattern has emerged and been dubbed the "Modern Data Stack". In this episode members of the GoDataDriven team, Guillermo Sanchez, Bram Ochsendorf, and Juan Perafan, explain the combinations of services that comprise this architecture, share their experiences working with clients to employ the stack, and the benefits of bringing engineers and business users together with data.

Announcements

  • Hello and welcome to the Data Engineering Podcast, the show about modern data management.
  • You listen to this show to learn about all of the latest tools, patterns, and practices that power data engineering projects across every domain. Now there’s a book that captures the foundational lessons and principles that underlie everything that you hear about here. I’m happy to announce I collected wisdom from the community to help you in your journey as a data engineer and worked with O’Reilly to publish it as 97 Things Every Data Engineer Should Know. Go to dataengineeringpodcast.com/97things today to get your copy!
  • When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • RudderStack’s smart customer data pipeline is warehouse-first. It builds your customer data warehouse and your identity graph on your data warehouse, with support for Snowflake, Google BigQuery, Amazon Redshift, and more. Their SDKs and plugins make event streaming easy, and their integrations with cloud applications like Salesforce and ZenDesk help you go beyond event streaming. With RudderStack you can use all of your customer data to answer more difficult questions and then send those insights to your whole customer data stack. Sign up free at dataengineeringpodcast.com/rudder today.
  • We’ve all been asked to help with an ad-hoc request for data by the sales and marketing team. Then it becomes a critical report that they need updated every week or every day. Then what do you do? Send a CSV via email? Write some Python scripts to automate it? But what about incremental sync, API quotas, error handling, and all of the other details that eat up your time? Today, there is a better way. With Census, just write SQL or plug in your dbt models and start syncing your cloud warehouse to SaaS applications like Salesforce, Marketo, Hubspot, and many more. Go to dataengineeringpodcast.com/census today to get a free 14-day trial.
  • Your host is Tobias Macey and today I’m interviewing Guillermo Sanchez, Bram Ochsendorf, and Juan Perafan about their experiences with managed services in the modern data stack in their work as consultants at GoDataDriven.

Interview

  • Introduction
  • How did you get involved in the area of data management?
  • Can you start by giving your definition of the modern data stack?
  • What are the key characteristics of a tool or platform that make it a candidate for the "modern" stack?
  • How does the modern data stack shift the responsibilities and capabilities of data professionals and consumers?
  • What are some difficulties that you face when working with customers to migrate to these new architectures?
  • What are some of the limitations of the components or

API BigQuery Cloud Computing CSV Data Engineering Data Management dbt DWH Hubspot Kubernetes Marketing Modern Data Stack Python Redshift SaaS Snowflake SQL Data Streaming
Data Engineering Podcast