Information session discussing Magnimind Academy's mentor-led data science internship program.
Gain insights into emerging trends, evolving roles, and the future direction of data science in an increasingly AI-powered world.
A talk about managing anxiety, burnout, and career decisions in tech. Kim Scott discusses emotional agility, thought models, and coaching that helped her align with her values, stay calm, and make bold career moves. The session includes a personal journey through data science and related roles, with practical tools to stay focused and authentic.
At Target, creating relevant guest experiences at scale takes more than great creative — it takes great data. In this session, we’ll explore how Target’s Data Science team is using first-party data, machine learning, and GenAI to personalize marketing across every touchpoint.
You’ll hear how we’re building intelligence into the content supply chain, turning unified customer signals into actionable insights, and using AI to optimize creative, timing, and messaging — all while navigating a privacy-first landscape. Whether it’s smarter segmentation or real-time decisioning, we’re designing for both scale and speed.
As the Chief Analytics Officer for New York City, I witnessed firsthand how data science and AI can transform public service delivery while navigating the unique challenges of government implementation. This talk will share real-world examples of successful data science initiatives in the government context, from predictive analytics for fire department risk modeling to machine learning models that improve social service targeting.
However, government data science isn't just about technical skill—it's about accountability, equity, and transparency. I'll discuss critical pitfalls including algorithmic bias, privacy concerns, and the importance of explainable AI in public decision-making.
We'll explore how traditional data science skills must be adapted for the public sector context, where stakeholders include not just internal teams but taxpayers, elected officials, and community advocates.
Whether you're a data scientist considering public service or a government professional seeking to leverage analytics, this session will provide practical insights into building data capacity that serves the public interest while maintaining democratic values and citizen trust.
As AI continues to shape human-computer interaction, there’s a growing opportunity and responsibility to ensure these technologies serve everyone, including people with communication disabilities. In this talk, I will present my ongoing work in developing a real-time American Sign Language (ASL) recognition system, and explore how integrating accessible design principles into AI research can expand both usability and impact.
The core of the talk will cover the Sign Language Recogniser project (available on GitHub), in which I used MediaPipe Studio together with TensorFlow, Keras, and OpenCV to train a model that classifies ASL letters from hand-tracking features.
I’ll share the methodology: data collection, feature extraction via MediaPipe, model training, and demo/testing results. I’ll also discuss challenges encountered, such as dealing with gesture variability, lighting and camera differences, latency constraints, and model generalization.
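As a rough sketch of the pipeline described above (not the author's exact code from the GitHub project, which uses MediaPipe Studio), the following assumes the classic mediapipe Hands solution for landmark extraction and a small Keras classifier over the 21 hand landmarks (63 features); the LETTERS label set and training arrays are hypothetical stand-ins.

```python
import cv2
import numpy as np
import mediapipe as mp
from tensorflow import keras

LETTERS = [chr(c) for c in range(ord("A"), ord("Z") + 1)]  # hypothetical label set

hands = mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=1)

def landmarks_from_frame(frame_bgr):
    """Return a flat (63,) vector of x, y, z for the 21 hand landmarks, or None."""
    result = hands.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    if not result.multi_hand_landmarks:
        return None
    lm = result.multi_hand_landmarks[0].landmark
    return np.array([[p.x, p.y, p.z] for p in lm], dtype=np.float32).ravel()

# a small dense classifier over the landmark features
model = keras.Sequential([
    keras.layers.Input(shape=(63,)),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dropout(0.3),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(len(LETTERS), activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(X_train, y_train, ...) where X_train stacks landmark vectors
# and y_train holds the corresponding letter indices (hypothetical arrays)
```

Training on landmark features rather than raw pixels keeps the model small and reduces sensitivity to lighting and camera differences, which is one way to address the generalization challenges mentioned above.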
Beyond the technical implementation, I’ll reflect on the broader implications: how accessibility-focused AI projects can promote inclusion, how design decisions affect trust and usability, and how women in AI & data science can lead innovation that is both rigorous and socially meaningful. Attendees will leave with actionable insights for building inclusive AI systems, especially in domains involving rich human modalities such as gesture or sign.
Whether you call it wrangling, cleaning, or preprocessing, data prep is often the most expensive and time-consuming part of the analytical pipeline. It may involve converting data into machine-readable formats, integrating many datasets, or detecting outliers, and it can be a large source of error if done manually. A lack of machine-readable or integrated data limits connectivity across fields as well as data accessibility, sharing, and reuse, becoming a significant contributor to research waste.
For students, it is perhaps the greatest barrier to adopting quantitative tools and advancing their coding and analytical skills. AI tools are available for automating the cleanup and integration, but due to the one-of-a-kind nature of these problems, these approaches still require extensive human collaboration and testing. I review some of the common challenges in data cleanup and integration, approaches for understanding dataset structures, and strategies for developing and testing workflows.
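As a concrete, simplified illustration of the kind of workflow the talk covers (the column names and plausibility thresholds below are hypothetical), here is a pandas cleanup function paired with a tiny built-in test, so the rules stay verifiable as they evolve:

```python
import pandas as pd

def clean_site_data(df: pd.DataFrame) -> pd.DataFrame:
    """Standardize one site's spreadsheet so it can be merged with others."""
    df = df.rename(columns=lambda c: c.strip().lower().replace(" ", "_"))
    df["date"] = pd.to_datetime(df["date"], errors="coerce")     # bad dates -> NaT
    df["temp_c"] = pd.to_numeric(df["temp_c"], errors="coerce")  # bad numbers -> NaN
    # flag (rather than silently drop) values outside a plausible range
    df["temp_suspect"] = ~df["temp_c"].between(-40, 55)
    return df

# a small test case keeps the workflow honest as the cleanup rules change
sample = pd.DataFrame({"Date": ["2024-01-05", "not a date"],
                       "Temp C": ["12.3", "999"]})
out = clean_site_data(sample)
assert out["date"].isna().sum() == 1                    # the malformed date was caught
assert out["temp_suspect"].tolist() == [False, True]    # the implausible reading was flagged
```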
AI has the potential to transform learning, work, and daily life for millions of people, but only if we design with accessibility at the core. Too often, disabled people are underrepresented in datasets, creating systemic barriers that ripple through models and applications. This talk explores how data scientists and technologists can mitigate bias, from building synthetic datasets to fine-tuning LLMs on accessibility-focused corpora. We’ll look at opportunities in multimodal AI (voice, gesture, AR/VR, and even brain-computer interfaces) that open new pathways for inclusion. Beyond accuracy, we’ll discuss evaluation metrics that measure usability, comprehension, and inclusion, and why testing with humans is essential to closing the gap between model performance and lived experience. Attendees will leave with three tangible ways to integrate accessibility into their own work through datasets, open-source tools, and collaborations. Accessibility is not just an ethical mandate; it’s a driver of innovation, and it begins with thoughtful, human-centered data science.
In a rapidly evolving advertising landscape where data, technology, and methodology converge, the pursuit of rigorous yet actionable marketing measurement is more critical—and complex—than ever. This talk will showcase how modern marketers and applied data scientists employ advanced measurement approaches—such as Marketing Mix Modeling (frequentist and Bayesian) and robust experimental designs, including randomized control trials and synthetic control-based counterfactuals—to drive causal inference in advertising effectiveness for meaningful business impact.
The talk will also address emergent aspects of applied marketing science, namely open-source methodologies, digital commerce platforms, and the use of artificial intelligence. Innovations from industry giants like Google and Meta, as well as open-source communities exemplified by PyMC-Marketing, have democratized access to methodological advances. The emergence of digital commerce platforms such as Amazon and Walmart, and the rich data they bring forward, is transforming how customer journeys and campaign effectiveness are measured across channels. Artificial intelligence is accelerating every facet of the data science workflow, from streamlining coding, modeling, and rapid prototyping (“vibe coding”) to enabling the integration of neural networks and deep learning techniques into traditional MMM toolkits. Collectively, these provide new and easy ways to experiment quickly and to learn the complex nonlinear dynamics and hidden patterns in marketing data.
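By way of illustration (not the speaker's model and not PyMC-Marketing's API), a minimal Bayesian MMM in plain PyMC might look like the sketch below: a fixed geometric adstock transform on synthetic media spend, non-negative priors on channel effects, and posterior sampling. Real MMMs, including those built with PyMC-Marketing, also estimate adstock and saturation parameters rather than fixing them.

```python
import numpy as np
import pymc as pm

def geometric_adstock(x, alpha, l_max=8):
    """Carry-over effect: spend today keeps influencing the next l_max weeks."""
    w = alpha ** np.arange(l_max)
    w /= w.sum()
    return np.convolve(x, w)[: len(x)]

# toy weekly data: stand-ins for real channel spend and sales
rng = np.random.default_rng(0)
n = 104
tv, search = rng.gamma(2, 50, n), rng.gamma(2, 30, n)
sales = 200 + 0.8 * geometric_adstock(tv, 0.6) + 1.2 * search + rng.normal(0, 20, n)

tv_ad = geometric_adstock(tv, 0.6)  # fixed decay for simplicity

with pm.Model() as mmm:
    intercept = pm.Normal("intercept", 200, 100)
    beta_tv = pm.HalfNormal("beta_tv", 2)        # media effects constrained to be non-negative
    beta_search = pm.HalfNormal("beta_search", 2)
    sigma = pm.HalfNormal("sigma", 50)
    mu = intercept + beta_tv * tv_ad + beta_search * search
    pm.Normal("sales", mu=mu, sigma=sigma, observed=sales)
    idata = pm.sample(1000, tune=1000, chains=2, random_seed=0)
```

The posterior over beta_tv and beta_search is what a Bayesian MMM reports as channel contribution, with full uncertainty rather than a single point estimate.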
Bringing these threads together, the talk will show how Ovative Group—a media and marketing technology firm—integrates domain expertise, open-source solutions, strategic partnerships, and AI automation into comprehensive measurement solutions. Attendees will gain practical insights on bridging academic rigor with business relevance, empowering careers in applied data science, and helping organizations turn marketing analytics into clear, actionable strategies.
As healthcare organizations accelerate their adoption of AI and data-driven systems, the challenge lies not only in innovation but in responsibly scaling these technologies within clinical and operational workflows. This session examines the technical and governance frameworks required to translate AI research into reliable and compliant real-world applications. We will explore best practices in model lifecycle management, data quality assurance, bias detection, regulatory alignment, and human-in-the-loop validation, grounded in lessons from implementing AI solutions across complex healthcare environments. Emphasizing cross-functional collaboration among clinicians, data scientists, and business leaders, the session highlights how to balance technical rigor with clinical relevance and ethical accountability. Attendees will gain actionable insights into building trustworthy AI pipelines, integrating MLOps principles in regulated settings, and delivering measurable improvements in patient care, efficiency, and organizational learning.
Fairness and inclusivity are critical challenges as AI systems influence decisions in healthcare, finance, and everyday life. Yet, most fairness frameworks are developed in limited contexts, often overlooking the data diversity needed for global reliability.
In this talk, Tito Osadebey shares lessons from his research on bias in computer vision models to highlight where fairness efforts often fall short and how data professionals can address these gaps. He’ll outline practical principles for building and evaluating inclusive AI systems, discuss pitfalls that lead to hidden biases, and explore what “fairness” really means in practice.
Tito Osadebey is an AI researcher and data scientist whose work focuses on fairness, inclusivity, and ethical representation in AI systems. He recently published a paper on bias in computer vision models using Nigerian food images, which examines how underrepresentation of the Global South affects model performance and trust.
Tito has contributed to research and industry projects spanning computer vision, NLP, GenAI and data science with organisations including Keele University, Synectics Solutions, and Unify. His work has been featured on BBC Radio, and he led a team from Keele University which secured 3rd place globally at the 2025 IEEE MetroXraine Forensic Handwritten Document Analysis Challenge.
He is passionate about making AI systems more inclusive, context-aware, and equitable, bridging the gap between technical innovation and human understanding.
The future of education is being reshaped by AI-powered personalization. Traditional online learning platforms offer static content that doesn't adapt to individual needs, but new technologies are creating truly interactive experiences that respond to each learner's context, pace, and goals. How can personalized AI tutoring bridge the gap between mass education and the gold standard of one-on-one human tutoring? What if every professional could have a private tutor that understands their industry, role, and specific challenges? As organizations invest in upskilling their workforce, the question becomes: how can we leverage AI to make learning more engaging, effective, and accessible for everyone?
As the Co-Founder & CEO of DataCamp, Jonathan Cornelissen has helped grow DataCamp to upskill 10M+ learners and 2800+ teams and enterprise clients. He is interested in everything related to data science, education, and entrepreneurship. He holds a Ph.D. in financial econometrics and was the original author of an R package for quantitative finance.
Yusuf Saber is a technology leader and entrepreneur with extensive experience building and scaling data-driven organizations across the Middle East. He is the Founder of Optima and a Venture Partner at COTU Ventures, with previous leadership roles at talabat, including VP of Data and Senior Director of Data Science and Engineering. Earlier in his career, he co-founded BulkWhiz and Trustious, and led data science initiatives at Careem. Yusuf holds research experience from ETH Zurich and began his career as an engineering intern at Mentor Graphics.
In the episode, Richie, Jo and Yusuf explore the innovative AI-driven learning platform Optima, its unique approach to personalized education, the potential for AI to enhance learning experiences, the future of AI in education, the challenges and opportunities in creating dynamic, context-aware learning environments, and much more.
Links Mentioned in the Show:
Read more about the announcement
Try the AI-Native Courses: Intro to SQL and Intro to AI for Work
New to DataCamp? Learn on the go using the DataCamp mobile app
Empower your business with world-class data and AI skills with DataCamp for business
AI and data analytics are transforming business, and your data career can’t afford to be left behind. 🎙️ In this episode of Data Career School, I sit down with Ketan Mudda, Director of Data Science & AI Solutions at Walmart, to explore how AI is reshaping retail, analytics, and decision-making—and what it means for students, job seekers, and early-career professionals in 2026.
We dive into:
How AI is driving innovation and smarter decisions in retail and business
Essential skills data professionals need to thrive in an AI-first world
How AI tools like ChatGPT are changing the way analysts work
What employers look for beyond technical expertise
Strategies to future-proof your data career
Ketan also shares his journey from Credit Risk Analyst at HSBC to leading AI-driven initiatives at one of the world’s largest retailers.
Whether you’re starting your data career, exploring AI’s impact on business, or curious about analytics in action, this episode is packed with actionable insights, inspiration, and career guidance.
🎙️ Hosted by Amlan Mohanty — creator of Data Career School, where we explore AI, data analytics, and the future of work. Follow me: 📺 YouTube 🔗 LinkedIn 📸 Instagram
🎧Listen now to level up your data career!
Chapters
00:00 The Journey of Ketan Mudda
05:18 AI's Transformative Impact on Industries
12:49 Responsible AI Practices
14:28 The Role of Education in Data Science
23:18 AI and the Future of Jobs
28:03 Embracing AI Tools for Success
29:44 The Importance of Networking
31:40 Curiosity and Continuous Learning
32:50 Storytelling in Data Science Leadership
36:22 Focus on AI Ethics and Change Management
41:03 Learning How to Learn
44:57 Identifying Problems Over Tools
Traditional subgraph isomorphism algorithms like VF2 rely on sequential tree search that can't leverage parallel computing. This talk introduces Δ-Motif, a data-centric approach that transforms graph matching into data operations using Python's data science stack. Δ-Motif decomposes graphs into small "motifs" and reconstructs matches from them. By representing graphs as tabular data with RAPIDS cuDF and Pandas, we achieve 10-595X speedups over VF2 without custom GPU kernels. I'll demonstrate practical applications from social networks to quantum computing, and show when GPU acceleration provides the biggest benefits for graph analysis problems. Perfect for data scientists working with network analysis, recommendation systems, or pattern matching at scale.
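To make the "graph matching as data operations" idea concrete, here is a toy sketch (not the Δ-Motif implementation itself) that finds triangle motifs in an edge table using ordinary pandas merges; swapping pandas for RAPIDS cuDF moves the same joins onto the GPU.

```python
import pandas as pd

# a small undirected graph stored as a table of edges (each edge listed in both directions);
# nodes 0, 1, 2 form a triangle
edges = pd.DataFrame({
    "src": [0, 1, 1, 2, 2, 3, 0, 2],
    "dst": [1, 0, 2, 1, 3, 2, 2, 0],
})

# step 1: join the edge table with itself to enumerate 2-edge paths a -> b -> c
paths = edges.merge(edges, left_on="dst", right_on="src", suffixes=("_ab", "_bc"))
paths = paths[paths["src_ab"] != paths["dst_bc"]]        # drop a -> b -> a backtracks

# step 2: a path a -> b -> c closes into a triangle if the edge c -> a also exists
tri = paths.merge(edges, left_on=["dst_bc", "src_ab"], right_on=["src", "dst"])
triangles = tri[["src_ab", "src_bc", "dst_bc"]].rename(
    columns={"src_ab": "a", "src_bc": "b", "dst_bc": "c"})
print(triangles)  # each triangle appears once per ordered vertex sequence; deduplicate in practice
```

The point of the data-centric framing is that every step is a relational join or filter, operations that columnar engines such as cuDF already parallelize, rather than a recursive tree search.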
LLMs have a lot of hype around them these days. Let’s demystify how they work and see how we can put them in context for data science use. As data scientists, we want to make sure our results are inspectable, reliable, reproducible, and replicable. We already have many tools to help us on this front. However, LLMs pose a new challenge: we may not always get the same results back from a query. This means identifying the areas where LLMs excel and using those behaviors in our data science artifacts. This talk will introduce you to LLMs, the chatlas package, and how they can be integrated into a Shiny app to create an AI-powered dashboard (using querychat). We’ll see how we can leverage the tasks LLMs are good at to better our data science products.
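For a flavour of the building block involved, here is a minimal sketch assuming chatlas exposes a ChatOpenAI client with a .chat() method and that an OPENAI_API_KEY is set in the environment; the Shiny and querychat dashboard wiring is what the talk itself covers.

```python
from chatlas import ChatOpenAI

# a system prompt keeps the assistant scoped to the analysis at hand
chat = ChatOpenAI(
    model="gpt-4o-mini",
    system_prompt="You answer questions about a penguins dataset. Be concise.",
)

# responses are not guaranteed to be identical across calls, which is exactly
# the reproducibility caveat raised above for data science artifacts
chat.chat("Which variables would you plot to compare species by body mass?")
```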
As datasets continue to grow in both size and complexity, CPU-based visualization pipelines often become bottlenecks, slowing down exploratory data analysis and interactive dashboards. In this session, we’ll demonstrate how GPU acceleration can transform Python-based interactive visualization workflows, delivering speedups of up to 50x with minimal code changes. Using libraries such as hvPlot, Datashader, cuxfilter, and Plotly Dash, we’ll walk through real-world examples of visualizing both tabular and unstructured data and demonstrate how RAPIDS, a suite of open-source GPU-accelerated data science libraries from NVIDIA, accelerates these workflows. Attendees will learn best practices for accelerating preprocessing, building scalable dashboards, and profiling pipelines to identify and resolve bottlenecks. Whether you are an experienced data scientist or developer, you’ll leave with practical techniques to instantly scale your interactive visualization workflows on GPUs.
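For a sense of what "minimal code changes" can mean in practice, here is a small Datashader sketch that rasterizes ten million points from a pandas DataFrame; under the assumption that RAPIDS and a compatible GPU are available, passing a cuDF DataFrame instead lets the same aggregation run on the GPU.

```python
import numpy as np
import pandas as pd
import datashader as ds
from datashader import transfer_functions as tf

# 10 million synthetic points; swap pd for cudf (import cudf) to aggregate on the GPU
n = 10_000_000
rng = np.random.default_rng(0)
df = pd.DataFrame({"x": rng.normal(size=n), "y": rng.normal(size=n)})

canvas = ds.Canvas(plot_width=800, plot_height=400)
agg = canvas.points(df, "x", "y")       # rasterize: count points per pixel
img = tf.shade(agg, how="log")          # map counts to colors on a log scale
img.to_pil().save("density.png")
```

The same swap-the-DataFrame pattern is what hvPlot and cuxfilter build on for interactive dashboards, so the application code stays largely unchanged while the heavy aggregation moves to the GPU.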
This talk explores how AI agents integrated directly into Jupyter notebooks can help with every part of your data science work. We'll cover the latest notebook-focused agentic features in VS Code, demonstrating how they automate tedious tasks like environment management and graph styling, turn your "scratch notebook" into shareable code, and more generally streamline data science workflows directly in notebooks.
Data science has the power to shape industries and societies. This panel will focus on empowering underrepresented groups in data science through education, access to tools, and career opportunities. Panelists will share their journeys, discuss the importance of democratizing data skills, and explore how to make the field more accessible to diverse talent.
In this show, we're joined by Sean Chandler, Director of BI at CenterWell Home Health, to explore what it really means to thrive in BI today. Sean shares his personal journey, including his move into teaching, and offers practical insights on building a career in BI, self-learning for advancement, and fostering a strong partnership between BI and data science teams. Whether you're an aspiring BI analyst, a data scientist aiming to improve collaboration, or a career changer eyeing the BI space, this episode is for you.
What You'll Learn:
How to successfully transition from other roles into BI, and how to know if it's the right fit for you
What good collaboration between BI and data science actually looks like, and how to recognize when it's broken
How self-taught skills can accelerate your BI career, even without a formal background
🤝 Follow Sean on LinkedIn!
Register for free to be part of the next live session: https://bit.ly/3XB3A8b
Follow us on socials: LinkedIn, YouTube, Instagram (Mavens of Data), Instagram (Maven Analytics), TikTok, Facebook, Medium, X/Twitter
In this episode of Data Skeptic's Recommender Systems series, Kyle sits down with Aditya Chichani, a senior machine learning engineer at Walmart, to explore the darker side of recommendation algorithms. The conversation centers on shilling attacks—a form of manipulation where malicious actors create multiple fake profiles to game recommender systems, either to promote specific items or sabotage competitors. Aditya, who researched these attacks during his undergraduate studies at SPIT before completing his master's in computer science with a data science specialization at UC Berkeley, explains how these vulnerabilities emerge particularly in collaborative filtering systems. From promoting a friend's ska band on Spotify to inflating product ratings on e-commerce platforms, shilling attacks represent a significant threat in an industry where approximately 4% of reviews are fake, translating to $800 billion in annual sales in the US alone.
The discussion delves deep into collaborative filtering, explaining both user-user and item-item approaches that create similarity matrices to predict user preferences. However, these systems face various shilling attacks of increasing sophistication: random attacks use minimal information with average ratings, while segmented attacks strategically target popular items (like Taylor Swift albums) to build credibility before promoting target items. Bandwagon attacks focus on highly popular items to connect with genuine users, and average attacks leverage item rating knowledge to appear authentic. User-user collaborative filtering proves particularly vulnerable, requiring as few as 500 fake profiles to impact recommendations, while item-item filtering demands significantly more resources.
Aditya addresses detection through machine learning techniques that analyze behavioral patterns using methods like PCA to identify profiles with unusually high correlation and suspicious rating consistency. However, this remains an evolving challenge as attackers adapt strategies, now using large language models to generate more authentic-seeming fake reviews. His research with the MovieLens dataset tested detection algorithms against synthetic attacks, highlighting how these concerns extend to modern e-commerce systems. While companies rarely share attack and detection data publicly to avoid giving attackers advantages, academic research continues advancing both offensive and defensive strategies in recommender systems security.
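To ground the mechanics, here is a small, self-contained sketch of user-user collaborative filtering and a crude profile-injection attack on a synthetic ratings matrix; the numbers are illustrative and not from the MovieLens experiments discussed in the episode.

```python
import numpy as np

def predict(ratings, user, item):
    """User-user collaborative filtering: weight other users' ratings of `item`
    by their cosine similarity to `user` (a 0 entry means unrated)."""
    norms = np.linalg.norm(ratings, axis=1)
    sims = ratings @ ratings[user] / (norms * norms[user] + 1e-9)
    sims[user] = 0.0                              # exclude the user's own profile
    rated = ratings[:, item] > 0
    return ratings[rated, item] @ sims[rated] / (np.abs(sims[rated]).sum() + 1e-9)

rng = np.random.default_rng(1)
genuine = rng.integers(1, 6, size=(200, 20)).astype(float)   # 200 users x 20 items
target_item = 5
print("before attack:", round(predict(genuine, user=0, item=target_item), 2))

# simple shilling attack: fake profiles give filler items an average rating and
# the target item the maximum rating, pushing its prediction up for genuine users
fakes = np.full((50, 20), 3.0)
fakes[:, target_item] = 5.0
attacked = np.vstack([genuine, fakes])
print("after attack: ", round(predict(attacked, user=0, item=target_item), 2))
```

Because the fake profiles are deliberately similar to everyone, they receive high similarity weights, which is the same behavioral signature (unusually high correlation and rating consistency) that PCA-based detection methods look for.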