talk-data.com

Topic

Data Science

machine_learning statistics analytics

1516 tagged

Activity Trend: 68 peak/qtr, 2020-Q1 to 2026-Q1

Activities

1516 activities · Newest first

In this podcast episode, we talked with Lavanya Gupta about Building a Strong Career in Data. About the Speaker: Lavanya is a Carnegie Mellon University (CMU) alumna of the Language Technologies Institute (LTI). She works as a Sr. AI/ML Applied Associate at JPMorgan Chase in their specialized Machine Learning Center of Excellence (MLCOE) vertical. Her latest research on long-context evaluation of LLMs was published in EMNLP 2024.

In addition to having a strong industrial research background of 5+ years, she is also an enthusiastic technical speaker. She has delivered talks at events such as Women in Data Science (WiDS) 2021, PyData, Illuminate AI 2021, TensorFlow User Group (TFUG), and MindHack! Summit. She also serves as a reviewer at top-tier NLP conferences (NeurIPS 2024, ICLR 2025, NAACL 2025). Additionally, through her collaborations with various prestigious organizations, like Anita Borg and Women in Coding and Data Science (WiCDS), she is committed to mentoring aspiring machine learning enthusiasts.

In this episode, we talk about Lavanya Gupta’s journey from software engineer to AI researcher. She shares how hackathons sparked her passion for machine learning, her transition into NLP, and her current work benchmarking large language models in finance. Tune in for practical insights on building a strong data career and navigating the evolving AI landscape.

🕒 TIMECODES
00:00 Lavanya’s journey from software engineer to AI researcher
10:15 Benchmarking long context language models
12:36 Limitations of large context models in real domains
14:54 Handling large documents and publishing research in industry
19:45 Building a data science career: publications, motivation, and mentorship
25:01 Self-learning, hackathons, and networking
33:24 Community work and Kaggle projects
37:32 Mentorship and open-ended guidance
51:28 Building a strong data science portfolio

🔗 CONNECT WITH LAVANYA
LinkedIn - / lgupta18

🔗 CONNECT WITH DataTalksClub
Join the community - https://datatalks.club/slack.html
Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/...
Check other upcoming events - https://lu.ma/dtc-events
LinkedIn - / datatalks-club
Twitter - / datatalksclub
Website - https://datatalks.club/

Data science isn't just about models and code; it's about people, connections, and shared knowledge. From online forums to in-person hackathons, communities play a crucial role in shaping careers and innovations. In this episode, we're joined by Yujian Tang, an expert in building and fostering data-driven communities, to discuss how these spaces can help you grow and why developer advocacy is a great role for people passionate about both building and fostering community. Whether you're looking to expand your network, break into data science, or grow your own community, this episode is packed with actionable insights from someone who has done it all.

What You'll Learn:
How getting involved in different communities can accelerate your career and open up unexpected opportunities
The key differences between various data communities and how to find the right fit
What it takes to build and nurture a thriving community of your own
The evolving role of the Developer Advocate in growing data product visibility through community

Register for free to be part of the next live session: https://bit.ly/3XB3A8b
Interested in learning more about GenAI? 👉 https://lu.ma/oss4ai
Follow us on Socials: LinkedIn, YouTube, Instagram (Mavens of Data), Instagram (Maven Analytics), TikTok, Facebook, Medium, X/Twitter

Agentic AI is here, but what is it? What are the differences between the traditional LLMs and this new agentic AI we're hearing about? With AI systems making autonomous decisions, driving analytics, and reshaping data strategies, what does this mean for analysts? We're joined by Vin Vashishta, CEO at V Squared and an expert in AI strategy and data science. Vin's book, From Data to Profit, lays out a roadmap for turning AI and analytics into real business value. AI isn't just a tool anymore; it's becoming a collaborator. How should we think about adapting? Don't miss his insights in this show!

What You'll Learn:
How Agentic AI will redefine the role of analysts in analytics
What makes an AI 'agent' different from a traditional LLM
Why knowledge graphs are the key to AI's next leap forward
How to future-proof your career in analytics

Register for free to be part of the next live session: https://bit.ly/3XB3A8b
Follow us on Socials: LinkedIn, YouTube, Instagram (Mavens of Data), Instagram (Maven Analytics), TikTok, Facebook, Medium, X/Twitter

Applied Machine Learning for Data Science Practitioners

A single-volume reference on data science techniques for evaluating and solving business problems using Applied Machine Learning (ML). Applied Machine Learning for Data Science Practitioners offers a practical, step-by-step guide to building end-to-end ML solutions for real-world business challenges, empowering data science practitioners to make informed decisions and select the right techniques for any use case. Unlike many data science books that focus on popular algorithms and coding, this book takes a holistic approach. It equips you with the knowledge to evaluate a range of techniques and algorithms. The book balances theoretical concepts with practical examples to illustrate key concepts, derive insights, and demonstrate applications. In addition to code snippets and reviewing output, the book provides guidance on interpreting results. This book is an essential resource if you are looking to elevate your understanding of ML and your technical capabilities, combining theoretical and practical coding examples. A basic understanding of using data to solve business problems, high school-level math and statistics, and basic Python coding skills are assumed.

Written by a recognized data science expert, Applied Machine Learning for Data Science Practitioners covers essential topics, including:
Data Science Fundamentals that provide you with an overview of core concepts, laying the foundation for understanding ML.
Data Preparation covers the process of framing ML problems and preparing data and features for modeling.
ML Problem Solving introduces you to a range of ML algorithms, including Regression, Classification, Ranking, Clustering, Patterns, Time Series, and Anomaly Detection.
Model Optimization explores frameworks, decision trees, and ensemble methods to enhance performance and guide the selection of the most effective model.
ML Ethics addresses ethical considerations, including fairness, accountability, transparency, and ethics.
Model Deployment and Monitoring focuses on production deployment, performance monitoring, and adapting to model drift.

Are you ready to level up your analytics game and tackle the challenges that come with data-heavy projects? In this episode, Harpreet Sahota, a data science leader with years of experience helping analysts and teams thrive, shares actionable insights and strategies for staying ahead in the fast-evolving world of data. Harpreet will help you develop a practical mindset to tackle real-world challenges and build the confidence to lead impactful projects. From cleaning messy datasets, to deciding between building or buying a solution, to training a computer vision model, Harpreet is here to share his expertise. Whether you're an aspiring data analyst or a seasoned professional, this episode will equip you with the skills and clarity to succeed.

What You'll Learn:
Data Cleaning for Any Data Type: Proven techniques to clean and prepare your data for analysis.
Training a Computer Vision Model: What to consider before you start and how to ensure success.
Build vs. Buy for LLMs: When to create your own solution and when to leverage existing tools.
Setting Yourself Up for Success as an Analyst: Strategies to stand out and make your work impactful.

Register for free to be part of the next live session: https://bit.ly/3XB3A8b
Interested in learning more from Harpreet? Connect with him on LinkedIn
Follow us on Socials: LinkedIn, YouTube, Instagram (Mavens of Data), Instagram (Maven Analytics), TikTok, Facebook, Medium, X/Twitter

Architecting Power BI Solutions in Microsoft Fabric

This book is a comprehensive guide to building sophisticated and robust Power BI solutions that solve common data problems effectively. Written with hands-on professionals in mind, it provides essential insights and practical advice to help you choose the right tools and approaches for any BI task. Readers will learn to create performant, secure, and innovative business intelligence systems.

What this Book will help me do
Identify the scenarios where each Power BI component fits best.
Apply secure and performance-conscious design principles when building BI solutions.
Leverage Microsoft Fabric and other advanced integrations to maximize Power BI's capabilities.
Implement AI-powered features such as Copilot and predictive modeling in Power BI.
Facilitate collaboration and governance using Power BI's advanced features.

Author(s)
Nagaraj Venkatesan has over 17 years of professional expertise in data platform technologies and business intelligence tools. Through a rich career in data solution architecture, he has mastered the art of designing efficient and reliable Power BI implementations. This book reflects his passion for empowering professionals to make the most of Power BI.

Who is it for?
If you are a solution architect, data engineer, or Power BI report developer looking to elevate your skills in designing optimized Power BI solutions, this book is for you. Business analysts and data scientists can also benefit immensely from the book's coverage of self-service BI and data science integration. Some familiarity with Power BI will enhance your learning experience, but newcomers eager to learn will also find it invaluable.

podcast_episode
by Cam Mura, Philip Bourne (UVA School of Data Science)

Here we explore how data science is revolutionizing our understanding of protein structures, with a special focus on the exciting developments in protein folding and evolution. We’re joined by two experts in the field: Philip Bourne, the founding dean of the UVA School of Data Science, and Cam Mura, a biomolecular data scientist. From new tools like DeepUrfold to the future of biomedical applications, Bourne and Mura provide a unique look into how cutting-edge technology is transforming the world of molecular biology.

Megan Bowers took an unconventional path to break into the data world. Starting from a self-guided Data Science Bootcamp, she shared her journey through blogging and gained millions of views, and then BOOM! Job offers and monetization opportunities flooded in. This is her story.

📌 Interested in blogging for my publication? Get on this interest list: https://tally.so/r/3l4xQW
💌 Join 10k+ aspiring data analysts & get my tips in your inbox weekly 👉 https://www.datacareerjumpstart.com/newsletter
🆘 Feeling stuck in your data journey? Come to my next free "How to Land Your First Data Job" training 👉 https://www.datacareerjumpstart.com/training
👩‍💻 Want to land a data job in less than 90 days? 👉 https://www.datacareerjumpstart.com/daa
👔 Ace The Interview with Confidence 👉 https://www.datacareerjumpstart.com/interviewsimulator

⌚ TIMESTAMPS
00:00 - Introduction
03:28 - Gaining traction and recognition through blogging.
07:08 - Career Growth and transition to Alteryx.
14:18 - Leveraging and advertising your domain expertise.
19:25 - What is a Data Journalist?
22:21 - Writing content.
24:29 - What is Alteryx?

🔗 CONNECT WITH MEGAN
🎥 YouTube Podcast Channel: https://www.youtube.com/playlist?list=PLfSLx4WE4q501UZjL3Hx-DiS4zyeePEN2
🤝 LinkedIn: https://www.linkedin.com/in/megandibble1/
📸 Instagram: https://www.instagram.com/alteryx/
💻 Alteryx Website: https://www.alteryx.com/

🔗 CONNECT WITH AVERY
🎥 YouTube Channel: https://www.youtube.com/@averysmith
🤝 LinkedIn: https://www.linkedin.com/in/averyjsmith/
📸 Instagram: https://instagram.com/datacareerjumpstart
🎵 TikTok: https://www.tiktok.com/@verydata
💻 Website: https://www.datacareerjumpstart.com/

Mentioned in this episode: Join the last cohort of 2025! The LAST cohort of The Data Analytics Accelerator for 2025 kicks off on Monday, December 8th and enrollment is officially open!

To celebrate the end of the year, we’re running a special End-of-Year Sale, where you’ll get: ✅ A discount on your enrollment 🎁 6 bonus gifts, including job listings, interview prep, AI tools + more

If your goal is to land a data job in 2026, this is your chance to get ahead of the competition and start strong.

👉 Join the December Cohort & Claim Your Bonuses: https://DataCareerJumpstart.com/daa

The GovExperts is a Data Points mini-series spotlighting the brilliant minds at GovEx who are shaping the future of public sector data work. Today we’re talking to Dr. Bertran de Lis, GovEx’s Director of Research and Analytics, about her path from astrophysics to data science, what it means for cities to adopt a “Data-Driven Culture,” and her groundbreaking work on the Coronavirus Resource Center and the new City Data Explorer.

Learn more about GovEx
Fill out our listener survey!

Ready to land your dream analytics job? In this episode, we sit down with Nick Singh (author, career coach, and Founder of DataLemur) to share insider tips for crushing analyst interviews. Nick will guide us through mock interview questions to sharpen your skills and discuss his portfolio project using Spotify hip-hop data that grabbed recruiters' attention. If you're preparing for interviews, brainstorming your next portfolio project, or looking for real-world inspiration to stand out as an analyst, this episode will give you practical tips you can implement immediately.

What You'll Learn:
Proven tips and tricks to succeed in analyst interviews.
How to answer common interview questions with confidence.
Why portfolio projects matter and how Nick's Spotify hip-hop data project helped him stand out.

Register for free to be part of the next live session: https://bit.ly/3XB3A8b
Interested in learning more from Nick? Check out his book, Ace the Data Science Interview, and DataLemur!
Follow us on Socials: LinkedIn, YouTube, Instagram (Mavens of Data), Instagram (Maven Analytics), TikTok, Facebook, Medium, X/Twitter

The modern data landscape is exploding with complexity. This session explores AI-first capabilities in BigQuery to simplify the discovery, preparation, and management of data. Learn about BigQuery Studio, BigQuery data canvas, and BigQuery data preparation for data science workloads. Discover how to connect and unify data from various sources for deeper insights and better business outcomes, with examples from Standard Industries, and how to prepare data for analytics and insights with AI-first data tools.

In this session, we’ll show how Vertex AI and BigQuery make the process of integrating data into AI models as easy as 1, 2, 3. You’ll learn how to seamlessly integrate your data estate at every stage of AI development, from exploration and feature engineering to model training and deployment. We’ll also introduce our Data Science Agent and Vertex AI Feature Store 2.0, and how you can accelerate your innovation velocity with AI.

Are you a data scientist or developer using Python to build AI models and generative AI applications? Learn how BigQuery can supercharge Python data science workflows with capabilities that give you the productivity of Python and allow BigQuery to handle core processing. Offloading Python processing enables large-scale data analysis and seamless production deployments along the data-to-AI journey. Find out how Deutsche Telekom modernized their machine learning platform with a radically simplified infrastructure and increased developer productivity.

3D Data Science with Python

Our physical world is grounded in three dimensions. To create technology that can reason about and interact with it, our data must be 3D too. This practical guide offers data scientists, engineers, and researchers a hands-on approach to working with 3D data using Python. From 3D reconstruction to 3D deep learning techniques, you'll learn how to extract valuable insights from massive datasets, including point clouds, voxels, 3D CAD models, meshes, images, and more. Dr. Florent Poux helps you leverage the potential of cutting-edge algorithms and spatial AI models to develop production-ready systems with a focus on automation. You'll get the 3D data science knowledge and code to:
Understand core concepts and representations of 3D data
Load, manipulate, analyze, and visualize 3D data using powerful Python libraries
Apply advanced AI algorithms for 3D pattern recognition (supervised and unsupervised)
Use 3D reconstruction techniques to generate 3D datasets
Implement automated 3D modeling and generative AI workflows
Explore practical applications in areas like computer vision/graphics, geospatial intelligence, scientific computing, robotics, and autonomous driving
Build accurate digital environments that spatial AI solutions can leverage

Florent Poux is an esteemed authority in the field of 3D data science who teaches and conducts research for top European universities. He's also head professor at the 3D Geodata Academy and innovation director for French Tech 120 companies.

Learn how to speed up popular data science libraries such as pandas and scikit-learn by up to 50x in Google Colab using pre-installed NVIDIA RAPIDS Python libraries. Boost both speed and scale for your workflows by simply selecting a GPU runtime in Colab – no code changes required. In addition, Gemini helps Colab users incorporate GPUs and generate pandas code from simple natural language prompts.

This Session is hosted by a Google Cloud Next Sponsor.

Discover how Target modernized its MLOps workflows using Ray and Vertex AI to build scalable ML applications.  This session will cover key strategies for optimizing model performance, ensuring security and compliance, and fostering collaboration between data science and platform teams. Whether you’re looking to streamline model deployment, enhance data access, or improve infrastructure management in a hybrid setup, this session provides practical insights and guidance for integrating Ray and Vertex AI into your MLOps roadmap.

Data Insight Foundations: Step-by-Step Data Analysis with R

This book is an essential guide designed to equip you with the vital tools and knowledge needed to excel in data science. Master the end-to-end process of data collection, processing, validation, and imputation using R, and understand fundamental theories to achieve transparency with literate programming, renv, and Git, and much more. Each chapter is concise and focused, rendering complex topics accessible and easy to understand. Data Insight Foundations caters to a diverse audience, including web developers, mathematicians, data analysts, and economists, and its flexible structure enables you to explore chapters in sequence or navigate directly to the topics most relevant to you. While examples are primarily in R, a basic understanding of the language is advantageous but not essential. Many chapters, especially those focusing on theory, require no programming knowledge at all. Dive in and discover how to manipulate data, ensure reproducibility, conduct thorough literature reviews, collect data effectively, and present your findings with clarity.

What You Will Learn
Data Management: Master the end-to-end process of data collection, processing, validation, and imputation using R.
Reproducible Research: Understand fundamental theories and achieve transparency with literate programming, renv, and Git.
Academic Writing: Conduct scientific literature reviews and write structured papers and reports with Quarto.
Survey Design: Design well-structured surveys and manage data collection effectively.
Data Visualization: Understand data visualization theory and create well-designed and captivating graphics using ggplot2.

Who this Book is For
Career professionals such as research and data analysts transitioning from academia to a professional setting where production quality significantly impacts career progression. Some familiarity with data analytics processes and an interest in learning R or Python are ideal.

UVA School of Data Science graduates pursue many career paths, including government, health care, technology, retail, and... finance. In this episode, we hear from two UVA data science alumni who put their data science degrees to work every day in their roles at Octus, a financial services company that uses data to provide insights to its clients in banking and legal services.

They discuss the integration of AI into various industries, the challenges of information overload, and the role of human expertise. We welcome Charu Rawat and Yihnew Eshetu, who earned their M.S. in Data Science degrees from UVA in 2019 and 2021, respectively, and Ben Rogers, vice president of AI and advanced analytics at Permira.

In the retail industry, data science is not just about crunching numbers; it's about driving business impact through well-designed experiments. A-B testing in a physical store setting presents unique challenges that require careful planning and execution. How do you balance the need for statistical rigor with the practicalities of store operations? What role does data science play in ensuring that test results lead to actionable insights?

Philipp Paraguya is the Chapter Lead for Data Science at Aldi DX. Previously, Philipp studied applied mathematics and computer science and has worked as a BI and advanced analytics consultant in various industries and projects since graduating. Due to his background as a software developer, he has a strong connection to classic software engineering and the sensible use of data science solutions.

In the episode, Adel and Philipp explore the intricacies of A-B testing in retail, the challenges of running experiments in brick-and-mortar settings, aligning stakeholders for successful experimentation, the evolving role of data scientists, the impact of genAI on data workflows, and much more.

Links Mentioned in the Show:
Aldi DX
Connect with Philipp
Course: Customer Analytics and A/B Testing in Python
Related Episode: Can You Use AI-Driven Pricing Ethically? with Jose Mendoza, Academic Director & Clinical Associate Professor at NYU
Sign up to attend RADAR: Skills Edition

New to DataCamp? Learn on the go using the DataCamp mobile app
Empower your business with world-class data and AI skills with DataCamp for business