talk-data.com

Activities & events

TidyTuesday 2026-01-27 · 23:00

Join R-Ladies Ottawa for a casual evening of programming on Tuesday, January 27th. We'll be participating in TidyTuesday, a weekly data visualization challenge organized by the R for Data Science community.

What is TidyTuesday?

Every week, a new dataset is posted online on the TidyTuesday GitHub repo, and folks from around the world create data visualizations using the dataset. It's an opportunity to put your programming skills into practice using real-world data in a way that's fun! It's also a great way for everyone to learn from each other, by sharing their visualizations and code.

What will the dataset be?

Even we don't know that (yet)! We'll have to wait until the day before the event to know what data we'll be working with. If you're interested in seeing some past datasets, take a look at the examples below, or visit the TidyTuesday GitHub repo to see all of the datasets dating back to 2018.

Examples from past TidyTuesdays:

Do I have to use R?

No! You can use any programming language or visualization software that you want. In fact, Python users from around the globe participate in TidyTuesday on a weekly basis.

Who is this event for?

No previous programming experience is required to participate, and we'll have experienced programmers in the room who can help you get started (or unstuck), if needed.

...But if you want to get the most out of the event, a good way to prepare is to watch the recording of the introduction to data visualization workshop we hosted back in 2024. :)

What should I bring?

  • Please bring a laptop so you can code along. We recommend that you have RStudio or another IDE (such as VS Code or Positron) installed ahead of time, but we can help you get one installed if needed!
  • Come ready to learn, share, and contribute to a safe and welcoming community!

How will this event work?

  • First few minutes of the event: Introductions, and taking a look at the dataset together as a group.
  • Time to create a data visualization using the language or software of your choice, either on your own or with a (new) friend! Grab a free snack while you're at it :)
  • Last ~30 minutes of the event: Show and tell session for anyone who would like to share their creation with the group.

What else do I need to know?

This event (like all R-Ladies events) is totally FREE to attend.

The event will take place at Bayview Yards, which is located just a few steps away from the Bayview O-Train station. There is also a free parking lot available for those who are driving. You can find us in the "Training Room", which is on the second floor of the Bayview Yards building.

This is an in-person event with limited space! Please only RSVP if you are able to attend in-person!

***Please note that the mission of R-Ladies is to increase gender diversity in the R community. This event is intended to provide a safe space for women and gender minorities. We ask for male allies to be invited by and accompanied by a woman or gender minority.***

We’re grateful to be part of the Bayview Meetups initiative and extend our thanks to Bayview Yards for generously providing the venue space.

TidyTuesday

Link to the Teams Meeting: https://teams.microsoft.com/l/meetup-join/19%3ameeting_NzYwZDU3ZTgtYmZjNS00ZmIyLTgyNDEtZjZlMDQ5MTUwYTJh%40thread.v2/0?context=%7b%22Tid%22%3a%22ecbebdab-1287-47b8-8709-694ffe8697d1%22%2c%22Oid%22%3a%22a901008d-8ba1-459d-b8fa-3eccacfccadf%22%7d

Agenda

18:00 Welcome and Intros

18:10 🎦Map Investigation. Thank you PBIR. 🗣️ Jacek Nosal

Since the first preview release of the new PBIR file format, Power BI has had a clear direction for its data visualization layer, and we now have the ability to look into the settings of individual report elements. Previously it was possible to take a PBIX file apart piece by piece, but working with that form of the report definition was a huge challenge. Now we can learn the structure of the data and test the editing capabilities. When working with data and delivering it to target recipients, presenting it on maps stirs up a lot of emotion. However, I sometimes have the impression that part of the community wants to use maps but does not understand the principles that can and should guide their use: there are limits and parameters whose purpose is poorly understood. Thanks to the new PBIR format, we can rediscover Power BI. This is a story about the new format and a selection of possibilities it opens up.

*** A massive thanks to our sponsor Spectrum IT. Without them, we could not host this event for free. Spectrum IT kindly provided funding for the room and pizza for in-person attendees. ***

🎦Map Investigation. Thank you PBIR. 🗣️Jacek Nosal

Power BI on Databricks Best Practices

Want to supercharge the house part of your lakehouse? This session will cover best practices for building visualizations with Power BI on top of Databricks SQL, in terms of performance, security, and different architecture patterns.

Hi, I'm Liping, a Data & AI Solution Architect and ex-Microsoft. I specialize in Big Data Analytics, Data Engineering, Data Warehousing, and Business Intelligence. In addition to helping customers build out data platforms and solve real-world problems using data, I am also a blogger (https://www.dataleaps.co.uk), YouTuber (https://www.youtube.com/@dataleaps), and international public speaker (https://www.sessionize.com/lipinght).

Follow me on socials (https://linktr.ee/liping_dataleaps) for data-career content and Data & AI tutorials.

Power BI on Databricks Best Practices | Liping Huang
PyData Warsaw #25 2025-01-22 · 18:00

All the ML/AI in 2025! We would like to invite you to the next PyData Warsaw Meetup: two great speakers, tons of knowledge, and discussions!

19:00 - Piotr Mierzejewski, Sunscrapers - "Overview of Data Test Types in dbt: From Built-In to Custom Solutions"

About the topic: Ensuring data quality and the correctness of analytical logic is a crucial part of data pipelines. dbt offers a powerful framework for testing, but it also comes with certain limitations. In this session, we will explore the various types of data tests available in dbt and discuss custom approaches you can implement to address specific challenges and tailor testing to your unique needs.

About the speaker: I am a Data Engineer with over 4 years of experience and a solid academic background in computer science. Throughout my career, I have worked on projects in the petrochemical, recruitment, healthcare, and financial sectors, which has given me a broad understanding of data across different contexts. This experience allows me to quickly grasp the business goals behind the data and to manage and transform it in ways that generate the greatest value for the organization.

On a daily basis, I work with Python and SQL, which are my primary tools. I value these languages for their versatility and capabilities, but I am equally comfortable supporting projects that use Java, JavaScript, or C#. I am not limited to a single technology stack and can quickly adapt to new tools to complete tasks effectively. I have also supported projects from the front-end side (React) and taken on roles as a tester or team leader when project needs required it. I am confident that my flexibility and readiness to take on different roles within a team contribute to achieving the best results.

I approach my work with a craftsman's mindset, focusing on building practical, reliable solutions that solve real problems. I'm committed to constantly improving my skills, adapting to new tools, and finding efficient ways to make data truly useful. For me, success means delivering solid results and adding genuine value to each project I take on.
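For a flavor of the built-in tests the session contrasts with custom solutions, dbt lets you declare generic tests (`unique`, `not_null`, `accepted_values`, `relationships`) directly on columns in a `schema.yml`; the model and column names below are hypothetical, for illustration only:

```yaml
# Hypothetical dbt schema.yml sketch -- model and column names are made up.
version: 2

models:
  - name: orders
    columns:
      - name: order_id
        tests:              # dbt's built-in generic tests
          - unique
          - not_null
      - name: status
        tests:
          - accepted_values:
              values: ["placed", "shipped", "returned"]
      - name: customer_id
        tests:              # referential integrity against another model
          - relationships:
              to: ref('customers')
              field: id
```

Custom needs beyond these typically become singular tests (a SQL file returning failing rows) or reusable custom generic tests, which is where the limitations discussed in the talk come in.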

19:45 - Mateusz Modrzejewski, Politechnika Warszawska - "MIDI, those cheesy sounds from the 90s? Wrong! Symbolic music processing with Python"

About the topic: "MIDI, those cheesy sounds from the 90s?" is an actual question some guy asked me at an AI conference one day. What an inspiring question! This talk flips the outdated view of MIDI as retro noise, showcasing it as a powerful format for representing and analyzing music. While some think it's obsolete, MIDI remains the backbone of modern music production and a very active research topic in machine learning. This talk unpacks MIDI's structure and demonstrates how Python libraries like mido, pretty_midi, and MidiTok turn it into a tool for research and creativity. From visualization to music generation, practical examples reveal the modern applications of symbolic music. The takeaway? Python makes it easier than ever to explore MIDI's potential and apply it to a wide range of musical and analytical tasks.

About the speaker: Mateusz Modrzejewski, PhD. Software engineer, researcher, conference speaker, author and co-author of papers on music information retrieval and audio AI. Assistant Professor at the Institute of Computer Science of Warsaw University of Technology, where he leads an Audio Intelligence Lab. Previously at Apple (Music Machine Learning team, Apple Music). Has also worked with research and engineering teams of other Fortune 500 companies, providing AI solutions and analytics.

Apart from his scientific and engineering work, he is also an experienced touring musician, having performed for audiences of up to 150,000 people and having toured in Poland, China, Vietnam, the UK, Germany, Ukraine, Lithuania and Estonia, among others. Some of the artists he has played with include The Dumplings, Grubson, Marek Dyjak, Chłopcy Kontra Basia, Maria Sadowska, Pablopavo i Ludziki, Majka Jeżowska, and the Michał Milczarek Trio.
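None of the libraries named above are needed to see why MIDI is such a convenient symbolic representation: a note event is just an integer from 0 to 127, from which pitch name and frequency follow arithmetically. A minimal sketch in plain Python (not the talk's own material):

```python
# A MIDI note is an integer 0-127; pitch class and octave follow arithmetically.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def note_name(midi_note: int) -> str:
    """Convert a MIDI note number to scientific pitch notation (60 is middle C, 'C4')."""
    octave = midi_note // 12 - 1
    return f"{NOTE_NAMES[midi_note % 12]}{octave}"

def note_frequency(midi_note: int) -> float:
    """Equal-temperament frequency in Hz, anchored at A4 (note 69) = 440 Hz."""
    return 440.0 * 2 ** ((midi_note - 69) / 12)

print(note_name(60), round(note_frequency(60), 2))   # middle C
print(note_name(69), round(note_frequency(69), 2))   # concert A
```

Libraries like pretty_midi and mido build on exactly this representation, adding timing, velocity, and file I/O on top.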

Venue: Centrum Innowacji Politechniki Warszawskiej, ul. Rektorska 4, Room 3.12 (3rd Floor)

PyData Warsaw #25

2024 Posit Conference Watch Party with R-Ladies Ottawa

Posit, the company that makes many of the tools you know and love (RStudio, Shiny, the tidyverse, and many more!) hosts an annual conference. This year, the conference took place in Seattle from August 12th-14th. Did you miss the conference, but still want to know what’s new in the world of data science? Look no further – R-Ladies Ottawa is hosting a posit::conf(2024) Watch Party!

This event will take place on October 23rd, from 5:30-7:30pm, at the Ottawa Public Library (Sunnyside branch - Program room 1B). Attendance is FREE and there will be door prizes available to win (see details below)! We’ll watch a few of the most popular talks from this year’s conference and have opportunities to discuss and network. Afterwards, we’ll be heading to a restaurant in the Glebe for anyone who would like to join!

Please note that the mission of R-Ladies is to increase gender diversity in the R community. While our Meetup Group is open to anyone to join, our events are intended to provide a safe space for women and gender minorities. Male allies may attend our in-person events if they are invited by and accompanied by a woman or gender minority.

Door prize giveaway

By attending this event, you’ll have a chance to win a copy of David Keyes’ new book, R for the Rest of Us: A Statistics-Free Introduction, courtesy of No Starch Press.

Description of the book: “Learn how to use R for everything from workload automation and creating online reports, to interpreting data, map making, and more.

Written by the founder of a very popular online training platform for the R programming language!

The R programming language is a remarkably powerful tool for data analysis and visualization, but its steep learning curve can be intimidating for some. If you just want to automate repetitive tasks or visualize your data, without the need for complex math, R for the Rest of Us is for you.”

Free stickers!

Posit has generously donated hex stickers for this event! Everyone who attends will be able to take some home.

Event schedule

5:30pm - 7:00pm: Watch party and discussion (see talk descriptions below!)
7:00pm: Networking and announcement of door prize winner
7:30pm: We’ll be heading to a restaurant in the Glebe, for anyone who would like to join!

The talks we’ll watch:

1. Introducing Positron (Julia Silge) and Exploratory Data Analysis in Python with Positron (Isabelle Zimmerman)

Positron is a next-generation data science IDE that is newly available to the community for early beta testing. This new IDE is an extensible tool built to facilitate exploratory data analysis, reproducible authoring, and publishing data artifacts. Positron currently supports these data workflows in Python, in R, or in both, and is designed with a forward-looking architecture that can support other data science languages in the future. In this session, learn from the team building Positron about how and why it is designed the way it is, what will feel familiar or new coming from other IDEs, and whether it might be a good fit for your own work.

2. Closeread: bringing Scrollytelling to Quarto - Andrew Bray

Scrollytelling is a style of web design that transitions graphics and text as a user scrolls, allowing stories to progress naturally. Despite its power, scrollytelling typically requires specialist web dev skills beyond the reach of many data scientists. Closeread is a Quarto extension that makes a wide range of scrollytelling techniques available to authors without traditional web dev experience, with support for cross-fading plots, graphics and other chunk output alongside narrative content. You can zoom in on poems, prose and images, as well as highlighting important phrases of text. Finally, Closeread allows authors with experience in Observable JS to write their own animated graphics that update smoothly as scrolling progresses.

3. GitHub: How To Tell Your Professional Story - Abigail Haddad

GitHub is more than just a version control tool; it's a way of explaining your professional identity to prospective employers and collaborators – and you can build your profile now, before you're looking for new opportunities. This talk is about how to think of GitHub as an opportunity, not a chore, and how to represent yourself well without turning the development of your GitHub profile into a part-time job. I'll talk about why GitHub adds value beyond a personal website, what kinds of projects are helpful to share, and some good development practices to get in the habit of, regardless of your project specifics.

posit::conf(2024) Watch Party

Welcome to our September 2024 Data & AI Meetup. The first session will be:

Creating cost transparency in multi-cloud enterprise: An OSS Approach

Gringotts is a cloud cost transparency tool that gives product teams in multi-cloud ecosystems insight into their spending, allowing them to drill down on current costs, get insights to make informed decisions about their products, and eventually control costs. It also supplies optimization recommendations so these teams can right-size their cloud workloads to run at optimal cost, and alerts them to anomalies in the daily expenditure of those workloads. The solution is built entirely on OSS frameworks, which keeps it cloud-agnostic and lets it run at very minimal operational expense. It consists of the following components:

  • Data ingestion: Done daily using APIs from the different cloud service providers, consumed by ETL services we developed in .NET running on containerized workloads in Kubernetes.
  • Data transformation: Uses internal business metadata mapping that lets us associate cloud workloads (at the resource level) with business entities within the organization. This association relates costs to a business entity and creates transparency.
  • Data acceleration: We use Dremio (an OSS data lake engine) to create a layer of abstraction on top of our data lake and to build the data dimensions and aggregations consumed by different visualization tools.
  • Data visualization: We use Apache Superset (an OSS visualization tool) to offer insights into the data we have collected and transformed. Superset offers immense capabilities, such as OAuth-based RBAC, row-level security, custom roles, and scheduled reports and alerts, making it a perfect fit for our visualization needs.
  • Anomaly detection & optimization recommendations: We use ML.NET to offer anomaly detection to business entities.

This is just a sneak peek into the solution, but I would love to talk more in depth at the meetup about how we have solved this problem for our organization.
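The transformation step described above is essentially a join of per-resource cost records against a business-metadata map, followed by aggregation. A minimal sketch in Python (the resource IDs and entity names are invented for illustration; the real system does this in .NET against cloud billing APIs):

```python
from collections import defaultdict

# Hypothetical metadata mapping: cloud resource ID -> owning business entity.
RESOURCE_TO_ENTITY = {
    "vm-frontend-01": "Checkout",
    "vm-frontend-02": "Checkout",
    "db-orders": "Orders",
}

def allocate_costs(daily_costs):
    """Aggregate per-resource daily costs up to business entities.

    `daily_costs` is an iterable of (resource_id, cost) pairs; resources
    missing from the metadata map are attributed to 'Unallocated'.
    """
    totals = defaultdict(float)
    for resource_id, cost in daily_costs:
        entity = RESOURCE_TO_ENTITY.get(resource_id, "Unallocated")
        totals[entity] += cost
    return dict(totals)

costs = [("vm-frontend-01", 12.50), ("vm-frontend-02", 9.75),
         ("db-orders", 30.00), ("vm-legacy", 4.10)]
print(allocate_costs(costs))
```

The "Unallocated" bucket is the interesting design point: surfacing unmapped resources is what drives teams to keep the metadata map complete, which is what makes the transparency trustworthy.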

Speaker: Nitin Sapru

I am a Senior Software Developer by day, aspiring tech superhero by night. I've been fighting the forces of software chaos in the IT industry for years, armed with my trusty C# and .NET for the better part of 9 years (and a secret stash of coffee).

There will be Pizza and Networking after the session.

September 2024 Reading Data & AI MeetUp (In-person Only)
Pandas Workout 2024-06-05

Practice makes perfect pandas! Work out your pandas skills against dozens of real-world challenges, each carefully designed to build an intuitive knowledge of essential pandas tasks.

In Pandas Workout you’ll learn how to:

  • Clean your data for accurate analysis
  • Work with rows and columns for retrieving and assigning data
  • Handle indexes, including hierarchical indexes
  • Read and write data in a number of common formats, such as CSV and JSON
  • Process and manipulate textual data from within pandas
  • Work with dates and times in pandas
  • Perform aggregate calculations on selected subsets of data
  • Produce attractive and useful visualizations that make your data come alive

Pandas Workout hones your pandas skills to a professional level through two hundred exercises, each designed to strengthen your pandas skills. You’ll test your abilities against common pandas challenges such as importing and exporting, data cleaning, visualization, and performance optimization. Each exercise uses a real-world scenario based on real-world data, from tracking the parking tickets in New York City to working out which country makes the best wines. You’ll soon find your pandas skills becoming second nature; no more trips to StackOverflow for what is now a natural part of your skillset.

About the Technology: Python’s pandas library can massively reduce the time you spend analyzing, cleaning, exploring, and manipulating data. And the only path to pandas mastery is practice, practice, and, you guessed it, more practice. In this book, Python guru Reuven Lerner is your personal trainer and guide through over 200 exercises guaranteed to boost your pandas skills.

About the Book: Pandas Workout is a thoughtful collection of practice problems, challenges, and mini-projects designed to build your data analysis skills using Python and pandas. The workouts use realistic data from many sources: the New York taxi fleet, Olympic athletes, SAT scores, oil prices, and more. Each can be completed in ten minutes or less. You’ll explore pandas’ rich functionality for string and date/time handling, complex indexing, and visualization, along with practical tips for every stage of a data analysis project.

What’s Inside:

  • Clean data with less manual labor
  • Retrieve and assign data
  • Process and manipulate text
  • Calculations on selected data subsets

About the Reader: For Python programmers and data analysts.

About the Author: Reuven M. Lerner teaches Python and data science around the world and publishes the “Bamboo Weekly” newsletter. He is the author of Manning’s Python Workout (2020).

Quotes:

“A carefully crafted tour through the pandas library, jam-packed with wisdom that will help you become a better pandas user and a better data scientist.” - Kevin Markham, Founder of Data School, creator of pandas in 30 days

“Will help you apply pandas to real problems and push you to the next level.” - Michael Driscoll, RFA Engineering, creator of Teach Me Python

“The explanations, paired with Reuven’s storytelling and personal tone, make the concepts simple. I’ll never get them wrong again!” - Rodrigo Girão Serrão, Python developer and educator

“The definitive source!” - Kiran Anantha, Amazon

O'Reilly Data Science Books
Thabata Romanowski – Data visualization and information design consultant; former data analyst @ Data Rocks NZ. Brian T. O’Neill – host.

This week on Experiencing Data, I chat with a new kindred spirit! Recently, I connected with Thabata Romanowski—better known as "T from Data Rocks NZ"—to discuss her experience applying UX design principles to modern analytical data products and dashboards. T walks us through her experience working as a data analyst in the mining sector, sharing the journey of how these experiences laid the foundation for her transition to data visualization. Now, she specializes in transforming complex, industry-specific data sets into intuitive, user-friendly visual representations, and addresses the challenges faced by the analytics teams she supports through her design business. T and I tackle common misconceptions about design in the analytics field, discuss how we communicate and educate non-designers on applying UX design principles to their dashboard and application design work, and address the problem with "pretty charts." We also explore some of the core ideas in T's Design Manifesto, including principles like being purposeful, context-sensitive, collaborative, and humanistic—all aimed at increasing user adoption and business value by improving UX.

Highlights / Skip to:

  • I welcome T from Data Rocks NZ onto the show (00:00)
  • T's transition from mining to leading an information design and data visualization consultancy (01:43)
  • T discusses the critical role of clear communication in data design solutions (03:39)
  • We address the misconceptions around the role of design in data analytics (06:54)
  • T explains the importance of journey mapping in understanding users' needs (15:25)
  • We discuss the challenges of accurately capturing end-user needs (19:00)
  • T and I discuss the importance of talking directly to end-users when developing data products (25:56)
  • T shares her 'I like, I wish, I wonder' method for eliciting genuine user feedback (33:03)
  • T discusses her Data Design Manifesto for creating purposeful, context-aware, collaborative, and human-centered design principles in data (36:37)
  • We wrap up the conversation and share ways to connect with T (40:49)

Quotes from Today’s Episode

"It's not so much that people…don't know what design is, it's more that they understand it differently from what it can actually do..." - T from Data Rocks NZ (06:59)

"I think [the misconception about design in technology] is rooted mainly in the fact that data has been very tied to IT teams, to technology teams, and they’re not always up to what design actually does.” - T from Data Rocks NZ (07:42)

“If you strip design of function, it becomes art. So, it’s not art… it’s about being functional and being useful in helping people.” - T from Data Rocks NZ (09:06)

"It’s not that people don’t know, really, that the word design exists, or that design applies to analytics and whatnot; it’s more that they have this misunderstanding that it’s about making things look a certain way, when in fact... It’s about function. It’s about helping people do stuff better." - T from Data Rocks NZ (09:19)

“Journey Mapping means that you have to talk to people... Data is an inherently human thing. It is something that we create ourselves. So, it’s biased from the start. You can’t fully remove the human from the data." - T from Data Rocks NZ (15:36)

“The biggest part of your data product success…happens outside of your technology and outside of your actual analysis. It’s defining who your audience is, what the context of this audience is, and to which purpose do they need that product.” - T from Data Rocks NZ (19:08)

“[In UX research], a tight, empowered product team needs regular exposure to end customers; there’s nothing that can replace that." - Brian O'Neill (25:58)

“You have two sides [end-users and data team] that are frustrated with the same thing. The side who asked wasn’t really sure what to ask. And then the data team gets frustrated because the users don’t know what they want…Nobody really understood what the problem is. There’s a lot of assumptions happening there. And this is one of the hardest things to let go.” - T from Data Rocks NZ (29:38)

“No piece of data product exists in isolation, so understanding what people do with it… is really important.” - T from Data Rocks NZ (38:51)

Links:

  • Design Matters Newsletter: https://buttondown.email/datarocksnz
  • Website: https://www.datarocks.co.nz/
  • LinkedIn: https://www.linkedin.com/company/datarocksnz/
  • BlueSky: https://bsky.app/profile/datarocksnz.bsky.social
  • Mastodon: https://me.dm/@datarocksnz

Experiencing Data w/ Brian T. O’Neill (AI & data product management leadership—powered by UX design)

When: Thursday 21st March 2024
Time: arrive for 5:45pm, with talks starting promptly at 6pm
Location: Number 1 Spinningfields, 1 Hardman Street, Manchester, M3 3EB
The session will not be streamed over MS Teams.
Complimentary drinks & pizza provided by our hosts & sponsors, Robert Walters.

March's PBIMCR will feature Pets at Home's Chris Elton & Brook Bracewell and Nat Van Gulck.

Pets at Home “The Journey from Tableau to Power BI self-service”

  • As-Is describing our Tableau viz solution on data sets supporting reporting
  • Setting the Standards
  • Themes
  • Governance & Control
  • DAX Standardisation
  • Domain based semantic models
  • How we identified the domains
  • Aspiring to few models supporting many reports
  • Considerations on when to use DirectQuery with GCP (drill-throughs) – demo, hopefully with some example costs reporting
  • Challenges of how to aggregate when we have distinct counts over time
  • Source control
  • Deployment Patterns
  • Demo: pushing a semantic model to Git source control (we'll explain how Azure DevOps is slicker, but it's out of scope)
  • Demo: updating measures on a deployed model using Git and ALM Toolkit
  • Demo: moving from dev to prod using Power BI deployment pipelines

Bio: Brook Bracewell Brook has nearly a decade of experience in data analytics, with a focus on data modelling, SQL, and DAX. Over the past 14 months at Pets at Home, Brook has been developing the Data Warehouse on Google Cloud Platform, specialising in optimising models for Power BI self-service. This role builds on Brook's extensive background in data warehouse implementations and Power BI deployments across various sectors, supporting both small and large enterprises. With a career history as a business analyst, project manager, and operational lead, Brook brings a wealth of expertise to the table in a pragmatic and experienced manner.

Chris Elton has over 10 years' experience in BI, mostly working in healthcare and recently moving into retail at Pets at Home. Chris has worked as a BI Analyst, developing dashboards to deliver value and add insight to the business, with a primary focus on customer experience. More recently he has led tool migrations to Power BI with a user-first approach, gaining stakeholder buy-in and driving a cultural shift.

PBI Inspector with Nat Van Gulck:

Testing the visual layer of Power BI reports has usually been a manual task. Thanks to the new Power BI Project file format, however, there is an opportunity to automatically check visuals for issues around accessibility, consistency, and performance, and thereby align with your organization's Power BI Centre of Excellence guidelines. In this session we will see how PBI Inspector, an open-source community tool, supports this process.

About Nat: I started my career over twenty years ago as a .NET Software Developer while also developing an interest in Data Visualization. This led me to work on Qlik and Tableau data projects. Then Power BI became so compelling that I had to join Microsoft. As a Cloud Solutions Architect I help customers be successful with Power BI; PBI Inspector is part of this endeavour. The adventure continues with Microsoft Fabric.

PBI Inspector GitHub repository: https://github.com/NatVanG/PBI-Inspector
Tutorial (PBI Inspector as part of an Azure DevOps pipeline): https://learn.microsoft.com/en-us/power-bi/developer/projects/projects-build-pipelines

Pets at Home - Data Journey from Tableau to PBI / Nat van Gulck - PBI Inspector
Data visualization (Part II) 2024-02-22 · 17:00

Join us for a 3-part workshop series meant to introduce you to R and RStudio. We will walk through the basics of how to handle data in R and how you can create data visualizations!

Please note: there is one Meetup link per workshop; please register for each workshop separately.

1. Introduction to R and RStudio Date: January 25th, 2024 at 12:00-1:00pm

In this session, Reiko will go through R vocabulary; creating clear, organized scripts; using GitHub repositories for sharing scripts; creating a project; importing data; and troubleshooting.

2. Data visualization (Part I) Date: February 1st, 2024 at 12:00-1:00pm

In this session, Reiko will introduce R Markdown and the package ggplot2. You will see how to create the following data visualizations:

  • histogram
  • scatter plot
  • line plot
  • bar plot

3. Data visualization (Part II) Date: February 22nd, 2024 at 12:00-1:00pm

In this session, Reiko will continue the topic of data visualizations. Using bar plots, you will see how to change colors and overall look and feel of a plot as well as exploring more plot types and how to save our masterpieces!

Data visualization (Part II)
Data visualization (Part I) 2024-02-01 · 17:00

Join us for a 3-part workshop series meant to introduce you to R and RStudio. We will walk through the basics of how to handle data in R and how you can create data visualizations!

Please note: there is one Meetup link per workshop; please register for each workshop separately.

1. Introduction to R and RStudio Date: January 25th, 2024 at 12:00-1:00pm

In this session, Reiko will go through R vocabulary; creating clear, organized scripts; using GitHub repositories for sharing scripts; creating a project; importing data; and troubleshooting.

2. Data visualization (Part I) Date: February 1st, 2024 at 12:00-1:00pm

In this session, Reiko will introduce R Markdown and the package ggplot2. You will see how to create the following data visualizations:

  • histogram
  • scatter plot
  • line plot
  • bar plot

3. Data visualization (Part II) Date: February 22nd, 2024 at 12:00-1:00pm

In this session, Reiko will continue the topic of data visualizations. Using bar plots, you will see how to change colors and overall look and feel of a plot as well as exploring more plot types and how to save our masterpieces!

Data visualization (Part I)

This is an Online event, the Teams link is on the right of this page for those who have registered.

18:15 - 18:30: Networking
18:30: VisOps with PBI Inspector – Nat Van Gulck
19:30: Break
19:35: Fabric for the Power BI User – Chris Webb

Session details:

1. VisOps with PBI Inspector

Testing the visual layer of Power BI reports has usually been a manual task. Thanks to the new Power BI Project file format, however, there is an opportunity to automatically check visuals for issues around accessibility, consistency, and performance, and thereby align with your organization's Power BI Centre of Excellence guidelines. In this session we will see how PBI Inspector, an open-source community tool, supports this process.

PBI Inspector GitHub repository: https://github.com/NatVanG/PBI-Inspector
Tutorial (PBI Inspector as part of an Azure DevOps pipeline): https://learn.microsoft.com/en-us/power-bi/developer/projects/projects-build-pipelines

2. Fabric for the Power BI user
So you’re an experienced Power BI user. You’ve heard about Fabric and it seems cool, but what’s in it for you – especially if you’re not going to start writing Python or building warehouses any time soon? In this session we’ll focus on the bits of Fabric that will make your life as a Power BI developer easier, in particular Dataflows Gen2 and the new Direct Lake mode for datasets.

Bios:

Nat Van Gulck: I started my career over twenty years ago as a .NET software developer while also developing an interest in data visualization. This led me to work on Qlik and Tableau data projects. Then Power BI became so compelling that I had to join Microsoft. As a Cloud Solutions Architect I help customers be successful with Power BI; PBI Inspector is part of this endeavour. The adventure continues with Microsoft Fabric.

Chris Webb: Chris is a member of the Fabric Customer Advisory Team at Microsoft. He has been using Microsoft BI tools for over twenty years and is a regular speaker at conferences and user groups all over the world. He blogs about Microsoft BI at https://blog.crossjoin.co.uk/

VisOps with PBI Inspector | Fabric for the Power BI User

We are excited to kick off 2024 with a double session!

Session 1: Realtime Analytics in Microsoft Fabric with Devang Shah:

A LinkedIn user: “Streams come to die in a data lake”. With Fabric Real-time Analytics, you are empowered to capture, analyse, visualize, and act on your data streams in near-real time before eventually putting them to rest in a lake. Join this demo-rich, example-filled session to get answers to the fundamental questions: “Why real-time analytics?”, “What is it?”, and “Where can I use it?”.

About Devang: Devang is a Principal Program Manager at Microsoft on the Fabric Customer Advisory Team, focusing on evangelization and adoption of the messaging and real-time analytics suite of products across the EMEA region.

Session 2: PBI Inspector with Nat Van Gulck:

Testing the visual layer of Power BI reports has usually been a manual task. Thanks to the new Power BI Project file format, however, there is an opportunity to automatically check visuals for issues around accessibility, consistency, and performance, and thereby align with your organization's Power BI Centre of Excellence guidelines. In this session we will see how PBI Inspector, an open-source community tool, supports this process.

About Nat: I started my career over twenty years ago as a .NET software developer while also developing an interest in data visualization. This led me to work on Qlik and Tableau data projects. Then Power BI became so compelling that I had to join Microsoft. As a Cloud Solutions Architect I help customers be successful with Power BI; PBI Inspector is part of this endeavour. The adventure continues with Microsoft Fabric.

PBI Inspector GitHub repository: https://github.com/NatVanG/PBI-Inspector
Tutorial (PBI Inspector as part of an Azure DevOps pipeline): https://learn.microsoft.com/en-us/power-bi/developer/projects/projects-build-pipelines

LATEST UPDATES: Before the main session starts, James and Prathy will present the latest updates to Microsoft Fabric and Power BI.

OPEN DATA CHALLENGE: This is your opportunity to showcase your talent with Microsoft Fabric using open data. This month the theme for the challenge is aviation data:

https://atmdata.github.io/sources/

You'll have the opportunity to give a short (max 5 minutes) presentation showing your report and other aspects of Microsoft Fabric used in creating your solution. Please bring a laptop to present, or contact us in advance.

For extra points, see if you can incorporate Realtime Streaming Data into your presentation!

VENUE: We will meet at the Nigel Frank Offices at 60 Great Tower Street. If you've not been there before, it's just a little further up the street from Brewdog. Google Maps link to help you find the building: https://goo.gl/maps/GRZWr5LfScZhApu6A

London Fabric User Group - Realtime Analytics in Fabric and PBI Inspector

Join us for a 3-part workshop series meant to introduce you to R and RStudio. We will walk through the basics of how to handle data in R and how you can create data visualizations!

Please note: there is one Meetup link per workshop, so please register for each workshop separately.

1. Introduction to R and RStudio Date: January 25th, 2024 at 12:00-1:00pm

In this session, Reiko will go through R vocabulary; creating clear, organized scripts; using GitHub repositories for sharing scripts; creating a project; importing data; and troubleshooting.

2. Data visualization (Part I) Date: February 1st, 2024 at 12:00-1:00pm

In this session, Reiko will introduce R Markdown and the package ggplot2. You will see how to create the following data visualizations:

  • histogram
  • scatter plot
  • line plot
  • bar plot

3. Data visualization (Part II) Date: February 22nd, 2024 at 12:00-1:00pm

In this session, Reiko will continue the topic of data visualization. Using bar plots, you will see how to change the colors and overall look and feel of a plot, explore more plot types, and save your masterpieces!

Introduction to R and RStudio

We are very happy to welcome you after the summer break to the 26th edition of PyData Trójmiasto prepared together with Nike!

Great talks, networking, pizza, and cool gifts included!;)

When: 20th September at 17:30

Where: the event is hosted by Nike at Nike PTC Office, Olivia Business Centre!

[Registration - obligatory to enter] Please register officially HERE

The capacity is limited. Be sure to save your seat!

Agenda:

17:30-17:45 - Gathering and Welcome

17:45-18:15 - Nike Office Tour

18:15-18:20 - Opening

18:20-19:10 - Fine-tuning text-to-image generative AI models

19:10-20:00 - Introducing Optuna – next-generation hyperparameter optimization

20:00-21:00 - Networking & Pizza

About Nike and the talks:

NIKE, Inc. is a technology company. From product development to big data management and leading-edge engineering and systems support, our teams at NIKE Global Technology reimagine the future at the confluence of tech and sport.

Fine-tuning text-to-image generative AI models by Maciej Mackiewicz

The presentation will cover how to fine-tune a Stable Diffusion text-to-image model on custom data. Besides regular fine-tuning, we will go over the Low-Rank Adaptation (LoRA) method, which can speed up training and greatly reduce the size of the fine-tuned artefact.
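The core idea behind LoRA can be sketched in a few lines: freeze the original weight matrix W and train only two small low-rank factors A and B, so the saved artefact is just A and B. This is a minimal, plain-Python illustration of that idea with made-up shapes, not the actual Stable Diffusion implementation.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def lora_weight(w, a, b, scale=1.0):
    """Effective weight: W + scale * (B @ A).

    W stays frozen; only the small factors A (r x in) and B (out x r)
    are trained, so the fine-tuned artefact is far smaller than W.
    """
    delta = matmul(b, a)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

# A 4x4 frozen weight adapted at rank r=1: 16 frozen values,
# but only 8 trainable ones (A is 1x4, B is 4x1).
W = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
A = [[0.1, 0.2, 0.3, 0.4]]          # r x in
B = [[1.0], [0.0], [0.0], [0.0]]    # out x r
W_eff = lora_weight(W, A, B, scale=0.5)
```

At large model scale the same arithmetic is why LoRA checkpoints are megabytes rather than gigabytes: only A and B are stored.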

About Maciej:

Senior Machine Learning Engineer at Nike with seven years of experience in software development, machine learning engineering, and MLOps. Alumnus of the Big Data studies programme at the Warsaw School of Economics. Most passionate about working with text data.

Introducing Optuna – next-generation hyperparameter optimization by Mateusz Serocki

The presentation will be about the Optuna package, which allows you to tune model hyperparameters efficiently. We will compare that solution to other common approaches. Along the way, we will also see how Optuna can visualize results, and we will talk about the interpretation of those visualizations. As a final step, you will learn how to use Optuna for tuning a neural network's structure and optimization algorithm.
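Optuna's define-by-run interface boils down to a suggest/evaluate/record loop. The stdlib-only sketch below mimics that interface with plain random search to show the shape of the API; it is an illustration of the idea, not Optuna itself, which uses much smarter samplers (such as TPE) under the hood.

```python
import random

class Trial:
    """Hands out hyperparameter suggestions for one evaluation."""
    def __init__(self, rng):
        self.rng = rng
        self.params = {}

    def suggest_float(self, name, low, high):
        value = self.rng.uniform(low, high)
        self.params[name] = value
        return value

def optimize(objective, n_trials, seed=0):
    """Random-search stand-in for a study's optimize(); returns best value/params."""
    rng = random.Random(seed)
    best_value, best_params = float("inf"), None
    for _ in range(n_trials):
        trial = Trial(rng)
        value = objective(trial)
        if value < best_value:
            best_value, best_params = value, trial.params
    return best_value, best_params

# Toy objective in the style of Optuna's introductory example: minimize (x - 2)^2.
def objective(trial):
    x = trial.suggest_float("x", -10, 10)
    return (x - 2) ** 2

best_value, best_params = optimize(objective, n_trials=200)
```

The real package keeps the same objective-function shape, so swapping this toy loop for `optuna.create_study().optimize(objective, n_trials=200)` is largely a drop-in change.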

About Mateusz:

I’m a Senior Machine Learning Engineer at NIKE. Throughout my career I have worked on various topics, mostly demand forecasting, classification, and regression problems. I focus on a deep understanding of the math behind machine learning algorithms.

In addition, you're welcome to join our new Discord group!

PyData Trójmiasto x Nike #26 [Nike PTC Office]

Agenda:

• 18:30 - Opening doors of the venue

• 19:00 - Welcome to PyBerlin! // Organisers

• 19:10 - Welcome from the host!

• 19:20 - Autometrics-py: the story behind the module // Nele Uhlemann
The autometrics project spans multiple languages to enable function-level metrics around the latency, request rate, and error rate of each function in your code base. A central aspect of the development is to define a developer-friendly approach to observability, and therefore to provide tools that work directly in the code base, in tooltips, and in the developer's IDE. Together with a co-worker, I started the Python module. In the presentation I will give some insights into design decisions, as well as difficulties we had during development. We will see how the module and the tools around autometrics-py can be used.

Speaker's bio: Nele Uhlemann is a Developer Advocate at Fiberplane. Her passion is enabling collaboration among the many stakeholders involved in building and running software. Having switched sides from application development to infrastructure topics, she understands the challenges of enabling knowledge sharing to run systems more efficiently.
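The function-level metrics idea from the autometrics-py talk can be sketched as a decorator that records call count, error count, and cumulative latency per function. This stdlib-only toy illustrates the concept; the real autometrics-py module exports such metrics to Prometheus and has a different API.

```python
import time
from collections import defaultdict
from functools import wraps

# One stats record per decorated function, keyed by function name.
METRICS = defaultdict(lambda: {"calls": 0, "errors": 0, "total_seconds": 0.0})

def autometrics(func):
    """Record call count, error count, and cumulative latency for func."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        stats = METRICS[func.__name__]
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        except Exception:
            stats["errors"] += 1
            raise
        finally:
            # Runs on both success and failure, so every call is counted.
            stats["calls"] += 1
            stats["total_seconds"] += time.perf_counter() - start
    return wrapper

@autometrics
def handler(x):
    if x < 0:
        raise ValueError("negative input")
    return x * 2

handler(3)
try:
    handler(-1)
except ValueError:
    pass
```

From counters like these, a request rate and error rate per function can be derived, which is the observability signal the talk describes.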

• 19:50 - Short break

• 20:20 - Techniques for Terrible Leadership // Gys Muller
These days there is plenty of excellent advice for being a great technical leader. However, aggregating, applying, and relating to this advice can often be challenging. Instead, it might be easier to explore the opposite by applying the inversion mental model. What if your goal was to be a terrible technical leader? What techniques would you follow? This talk will show how to maximise your chances of being the worst leader you can be, in order to identify behaviours and actions that should in reality be avoided.

Speaker's bio: Gys loves building things with fellow makers. Over the span of his career he has worked on a range of hardware and software products across a variety of industries. Initially starting out in mechatronics engineering, he pivoted to software development after co-founding a computer vision company. For the past 6 years he has been the CTO of OfferZen, helping to grow the team and developing the product that connects developers with opportunities to build an awesome future. OfferZen’s journey of going from zero to startup, to scale-up, has allowed him to gain valuable experience about people, leadership, and strategy.

• 20:50 - HoloViz: Visualization and Interactive Dashboards in Python // Jean-Luc Stevens
HoloViz is an open-source, high-level Python visualization ecosystem that gives you the superpower to satisfy all your data visualization needs. In this talk, you will learn how to build visualizations easily even for big and multidimensional data, how to turn nearly any notebook into a deployable dashboard, and how to build interactive drill-down exploratory tools for your data and models without having to run a web-technology software development project. You will also learn how to turn your dashboard into WebAssembly and run it entirely in the browser with the magic of Pyodide.

Speaker's bio: Jean-Luc Stevens is a Senior Software Engineer who has been building custom, open-source visualization and analysis solutions for clients since joining Anaconda in 2015. Prior to joining the company, Dr. Stevens received a Ph.D. in Computational Neuroscience from the University of Edinburgh where he studied the dynamics of neural activity in the mammalian visual system. He now maintains the HoloViews visualization library, originally written as part of his doctoral thesis, which supports the rich interactive data visualization at the core of the HoloViz ecosystem.

• 21:20 - Closing session // Organisers

This event will be in-person only. Please check our Code of Conduct and the official health regulations in Berlin before coming. If you feel any signs of sickness, please consider skipping this event and attending another time. We will have plenty of events in different formats in the future. Looking forward to seeing you all soon!

PyBerlin 41 - ☀️☀️ Summer event ☀️☀️
Brian T. O’Neill – host , Manav Misra – Chief Data and Analytics Officer @ Regions Bank

Today, I chat with Manav Misra, Chief Data and Analytics Officer at Regions Bank. I begin by asking Manav what it was like to come in and implement a user-focused mentality at Regions, driven by his experience in the software industry. Manav details his approach, which included developing a new data product partner role and using effective communication to gradually gain trust and cooperation from all the players on his team. 

Manav then talks about how, over time, he solidified a formal framework for his team to be trained to use this approach and how his hiring is influenced by a product orientation. We also discuss his definition of data product at Regions, which I find to be one of the best I’ve heard to date. Today, Regions Bank’s data products are delivering tens of millions of dollars in additional revenue to the bank. Given those results, I also dig into the role of design and designers to better understand who is actually doing the designing of Regions’ data products to make them so successful. Later, I ask Manav what it’s like when designers and data professionals work on the same team and how UX and data visualization design are handled at the bank.

Towards the end, Manav shares what he has learned from his time at Regions and what he would implement in a new organization if starting over. He also expounds on the importance of empowering his team to ask customers the right questions and how a true client/stakeholder partnership has led to Manav’s most successful data products.

Highlights / Skip to:

Brief history of decision science and how it influenced the way data science and analytics work has been done (and unfortunately still is in many orgs) (1:47)
Manav’s philosophy and methods for changing the data science culture at Regions Bank to being product- and user-driven (5:19)
Manav talks about the size of his team and the data product role within the team, as well as what he had to do to convince leadership to buy in to the necessity of the data product partner role (10:54)
Quantifying and measuring the value of data products at Regions and some of his results (which include tens of millions of dollars in additional revenue) (13:05)
What’s a “data product” at Regions? Manav shares his definition (13:44)
Who does the designing of data products at Regions? (17:00)
The challenges and benefits of having a team comprised of both designers and data scientists (20:10)
Lessons Manav has learned from building his team and culture at Regions (23:09)
How Manav coaches his team and gives them the confidence to ask the right questions (27:17)
How true partnership has led to Manav’s most successful data products (31:46)

Quotes from Today’s Episode Re: how traditional, non-product oriented enterprises do data work: “As younger people come out of data science programs…that [old] culture is changing. The folks coming into this world now are looking to make an impact and then they want to see what this can do in the real world.” — Manav 

On the role of the Data Product Partner: “We brought in people that had both business knowledge as well as the technical knowledge, so with a combination of both they could talk to the ‘Internal customers,’ of our data products, but they could also talk to the data scientists and our developers and communicate in both directions in order to form that bridge between the two.” — Manav

“There are products that are delivering tens of millions of dollars in terms of additional revenue, or stopping fraud, or any of those kinds of things that the products are designed to address, they’re delivering and over-delivering on the business cases that we created.” — Manav 

“The way we define a data product is this: an end-to-end software solution to a problem that the business has. It leverages data and advanced analytics heavily in order to deliver that solution.” — Manav 

“The deployment and operationalization is simply part of the solution. They are not something that we do after; they’re something that we design in from the start of the solution.” — Brian 

“Design is a team sport. And even if you don’t have a titled designer doing the work, if someone is going to use the solution that you made, whether it’s a dashboard, or report, or an email, or notification, or an application, or whatever, there is a design, whether you put intention behind it or not.” — Brian

“As you look at interactive components in your data product, which are, you know, allowing people to ask questions and then get answers, you really have to think through what that interaction will look like, what’s the best way for them to get to the right answers and be able to use that in their decision-making.” — Manav 

“I have really instilled in my team that tools will come and go, technologies will come and go, [and so] you’ll have to have that mindset of constantly learning new things, being able to adapt and take on new ideas and incorporate them in how we do things.” — Manav

Links Regions Bank: https://www.regions.com/ LinkedIn: https://www.linkedin.com/in/manavmisra/

Analytics Dashboard Data Science DataViz
Experiencing Data w/ Brian T. O’Neill (AI & data product management leadership—powered by UX design)
Mico Yuk – Co-Founder @ Data Storytelling Academy , Allen Hillery – editor @ Nightingale

You know me — I love community! Being a part of the BI community has changed my life, and it can change yours too for the better if you choose the right community and understand how to use it to your advantage. Listen and learn.

Today's guest is Allen Hillery, editor of Nightingale, the Data Visualization Society's journal. Allen describes why community is important and what you can do to give and take within the community. Recently, he interviewed me and wrote a very popular article on Medium titled, "Mico Yuk on the Importance of Community and the Paradigm Shift in Business Intelligence."

In this episode, you'll learn:
[09:25] Allen's background: writer, editor, and adjunct professor passionate about storytelling with data.
[10:40] Data business communities: first there were not enough; now there are too many to choose from.
[11:03] Priorities put in place: the passing of family members led to self-discovery and fulfillment through a data storytelling journey.
For full show notes and the links mentioned, visit: bibrainz.com/podcast/44
Sponsor: The next BI Data Storytelling Mastery Accelerator 3-day live workshop is live! Many BI teams are still struggling to deliver consistent, highly engaging analytics their users love. At the end of three days, you'll leave with a clear BI delivery action plan. Register today!
Enjoyed the show? Please leave us a review on iTunes.

Analytics BI DataViz
Analytics on Fire

Introduces basic concepts in probability and statistics to data science students, as well as engineers and scientists.

Aimed at undergraduate/graduate-level engineering and natural science students, this timely, fully updated edition of a popular book on statistics and probability shows how real-world problems can be solved using statistical concepts. It removes Excel exhibits and replaces them with R software throughout, and updates both MINITAB and JMP software instructions and content. A new chapter discussing data mining—including big data, classification, machine learning, and visualization—is featured. Another new chapter covers cluster analysis methodologies in hierarchical, nonhierarchical, and model-based clustering. The book also offers a chapter on response surfaces that previously appeared on the book’s companion website.

Statistics and Probability with Applications for Engineers and Scientists using MINITAB, R and JMP, Second Edition is broken into two parts. Part I covers topics such as: describing data graphically and numerically, elements of probability, discrete and continuous random variables and their probability distributions, distribution functions of random variables, sampling distributions, estimation of population parameters, and hypothesis testing. Part II covers: elements of reliability theory, data mining, cluster analysis, analysis of categorical data, nonparametric tests, simple and multiple linear regression analysis, analysis of variance, factorial designs, response surfaces, and statistical quality control (SQC), including phase I and phase II control charts. The appendices contain statistical tables and charts and answers to selected problems.
  • Features two new chapters—one on data mining and another on cluster analysis
  • Now contains R exhibits including code, graphical displays, and some results
  • MINITAB and JMP have been updated to their latest versions
  • Emphasizes the p-value approach and includes related practical interpretations
  • Offers a more applied statistical focus, and features modified examples to better exhibit statistical concepts
  • Supplemented with an instructor's-only solutions manual on the book’s companion website

Statistics and Probability with Applications for Engineers and Scientists using MINITAB, R and JMP is an excellent text for graduate-level data science students, and engineers and scientists. It is also an ideal introduction to applied statistics and probability for undergraduate students in engineering and the natural sciences.

data data-science data-science-tasks statistics AI/ML Big Data Data Science
Dr. Amar Sahay – author

Business Analytics: A Data-Driven Decision Making Approach for Business, Part I provides an overview of business analytics (BA), business intelligence (BI), and the role and importance of these in modern business decision-making. The book discusses all these areas along with the three main analytics categories: (1) descriptive, (2) predictive, and (3) prescriptive analytics, with their tools and applications in business. This volume focuses on descriptive analytics, which involves the use of descriptive and visual or graphical methods, numerical methods, data analysis tools, big data applications, and data dashboards to understand business performance. The highlights of this volume are: business analytics at a glance; business intelligence (BI) and data analytics; data, data types, and descriptive analytics; data visualization tools; data visualization with big data; descriptive analytics - numerical methods; and case analysis with computer applications.

data data-science business-intelligence Analytics BI Big Data Data Analytics DataViz