talk-data.com talk-data.com

Topic

GenAI

Generative AI

ai machine_learning llm

1517

tagged

Activity Trend

192 peak/qtr
2020-Q1 2026-Q1

Activities

1517 activities · Newest first

Powering Personalization with Data Science at Target with Samantha Schumacher

At Target, creating relevant guest experiences at scale takes more than great creative — it takes great data. In this session, we’ll explore how Target’s Data Science team is using first-party data, machine learning, and GenAI to personalize marketing across every touchpoint.

You’ll hear how we’re building intelligence into the content supply chain, turning unified customer signals into actionable insights, and using AI to optimize creative, timing, and messaging — all while navigating a privacy-first landscape. Whether it’s smarter segmentation or real-time decisioning, we’re designing for both scale and speed.

AI collaboration increases productivity, but costs worker autonomy and efficacy with Yuqing Ren

As organizations and individual workers increasingly adopt generative AI (GenAI) to improve productivity, there is limited understanding of how different modes of human-AI interactions affect worker experience.

In this study, we examine the ordering effect of human-AI collaboration on worker experience through a series of pre-registered laboratory and online experiments involving common professional writing tasks. We study three collaboration orders: AI-first when humans prompt AI to draft the work and then improve it, human-first when humans draft the work and ask AI to improve it, and no-AI. Our results reveal an important trade-off between worker productivity and worker experience: while workers completed the writing draft more quickly in the AI-first condition than in the human-first condition, they reported significantly lower autonomy and efficacy. This negative ordering effect affected primarily female workers, not male workers.

Furthermore, being randomly assigned to a collaboration mode increased workers’ likelihood of choosing the same mode for similar tasks in the future, especially for the human-first collaboration mode. In addition, writing products generated with the use of GenAI were longer, more complex, and required higher grade levels to comprehend. Together, our findings highlight the potential hidden risks of integrating GenAI into workflow and the imperative of designing human-AI collaborations to balance work productivity with human experiences.

Reimagining Data-Driven Decisions in Education through Critical Data Literacy with Shreepriya Dogra

Artificial Intelligence (AI) and Generative AI (GenAI) are marketed as upgrades to data-driven decision making in education, promising faster predictions, personalization, and adaptive interventions. Yet these systems do not address the fundamental problems like over-reliance on quantifiable metrics, bias, inequity, and lack of transparency, embedded in educational data practices; they amplify them.

Across platforms such as Learning Management Systems (LMS), institutional dashboards, and predictive models, what is counted as “data” remains narrow: logins, clicks, scores, demographics, and test results. Excluded are lived experiences, complex identities, and structural inequities. These omissions are not accidental; they are design choices shaped by institutional priorities and power.

Drawing on O’Neil and Broussard, this session highlights how data-driven systems risk misinterpretation, reductionism, and exclusion. Participants will engage with scenarios that demonstrate both the promises and pitfalls of triangulating educational data. Together, we will discuss how such data might be misinterpreted, reduced, or stripped of context when filtered through AI systems.

As a starting point to navigate these problems, Critical Data Literacy is introduced as a framework for reimagining data practices through comprehension, critique, and participation. It equips participants engaging with data-driven systems in education and beyond to interrogate how data is produced, whose knowledge counts, and what is excluded or marginalized.

Participants will leave with reflective questions to guide their own practice: Better for whom? What is not on the screen? Whose goals are being personalized? Without this lens, AI risks accelerating inequities under the guise of objectivity.

From Predictions to Action: The AI Agent Revolution with Fareeha Amber Ansari

Large language models are powerful, but their true potential emerges when they evolve into AI agents which are systems that can reason, plan, and take action autonomously. My talk will explore the shift from using models as passive tools to designing agents that actively interact with data, systems, and people.

I will cover: - Gen AI and Agentic AI – How are These Different - Single Agent (monolithic) and Multi Agent Architectures (modular / distributed) - Open Source and Closed Source AI Systems - Challenges of Integrating Agents with Existing Systems

I will break down the technical building blocks of AI agents, including memory, planning loops, tool integration, and feedback mechanisms. Examples will be used to highlight how agents are being used in workflow automation, knowledge management, and decision support.

I will wrap up with where limitations of AI Agents still pose risks: - Assessing Maturity Cycle of Agents - Cybersecurity Risks of Agents

By the end, attendees will understand: - What makes AI agents different from LLMs - Technical considerations required to build AI Agents responsibly - Applicable knowledge to begin experimenting with agents.

talk
by Tito Osadebey (Keele University; Synectics Solutions; Unify)

Fairness and inclusivity are critical challenges as AI systems influence decisions in healthcare, finance, and everyday life. Yet, most fairness frameworks are developed in limited contexts, often overlooking the data diversity needed for global reliability.

In this talk, Tito Osadebey shares lessons from his research on bias in computer vision models to highlight where fairness efforts often fall short and how data professionals can address these gaps. He’ll outline practical principles for building and evaluating inclusive AI systems, discuss pitfalls that lead to hidden biases, and explore what “fairness” really means in practice.

Tito Osadebey is an AI researcher and data scientist whose work focuses on fairness, inclusivity, and ethical representation in AI systems. He recently published a paper on bias in computer vision models using Nigerian food images, which examines how underrepresentation of the Global South affects model performance and trust.

Tito has contributed to research and industry projects spanning computer vision, NLP, GenAI and data science with organisations including Keele University, Synectics Solutions, and Unify. His work has been featured on BBC Radio, and he led a team from Keele University which secured 3rd place globally at the 2025 IEEE MetroXraine Forensic Handwritten Document Analysis Challenge.

He is passionate about making AI systems more inclusive, context-aware, and equitable bridging the gap between technical innovation and human understanding.

Summary In this episode of the Data Engineering Podcast Ariel Pohoryles, head of product marketing for Boomi's data management offerings, talks about a recent survey of 300 data leaders on how organizations are investing in data to scale AI. He shares a paradox uncovered in the research: while 77% of leaders trust the data feeding their AI systems, only 50% trust their organization's data overall. Ariel explains why truly productionizing AI demands broader, continuously refreshed data with stronger automation and governance, and highlights the challenges posed by unstructured data and vector stores. The conversation covers the need to shift from manual reviews to automated pipelines, the resurgence of metadata and master data management, and the importance of guardrails, traceability, and agent governance. Ariel also predicts a growing convergence between data teams and application integration teams and advises leaders to focus on high-value use cases, aggressive pipeline automation, and cataloging and governing the coming sprawl of AI agents, all while using AI to accelerate data engineering itself.

Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data managementData teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.Composable data infrastructure is great, until you spend all of your time gluing it together. Bruin is an open source framework, driven from the command line, that makes integration a breeze. Write Python and SQL to handle the business logic, and let Bruin handle the heavy lifting of data movement, lineage tracking, data quality monitoring, and governance enforcement. Bruin allows you to build end-to-end data workflows using AI, has connectors for hundreds of platforms, and helps data teams deliver faster. Teams that use Bruin need less engineering effort to process data and benefit from a fully integrated data platform. Go to dataengineeringpodcast.com/bruin today to get started. And for dbt Cloud customers, they'll give you $1,000 credit to migrate to Bruin Cloud.Your host is Tobias Macey and today I'm interviewing Ariel Pohoryles about data management investments that organizations are making to enable them to scale AI implementationsInterview IntroductionHow did you get involved in the area of data management?Can you start by describing the motivation and scope of your recent survey on data management investments for AI across your respondents?What are the key takeaways that were most significant to you?The survey reveals a fascinating paradox: 77% of leaders trust the data used by their AI systems, yet only half trust their organization's overall data quality. For our data engineering audience, what does this suggest about how companies are currently sourcing data for AI? Does it imply they are using narrow, manually-curated "golden datasets," and what are the technical challenges and risks of that approach as they try to scale?The report highlights a heavy reliance on manual data quality processes, with one expert noting companies feel it's "not reliable to fully automate validation" for external or customer data. At the same time, maturity in "Automated tools for data integration and cleansing" is low, at only 42%. What specific technical hurdles or organizational inertia are preventing teams from adopting more automation in their data quality and integration pipelines?There was a significant point made that with generative AI, "biases can scale much faster," making automated governance essential. From a data engineering perspective, how does the data management strategy need to evolve to support generative AI versus traditional ML models? What new types of data quality checks, lineage tracking, or monitoring for feedback loops are required when the model itself is generating new content based on its own outputs?The report champions a "centralized data management platform" as the "connective tissue" for reliable AI. How do you see the scale and data maturity impacting the realities of that effort?How do architectural patterns in the shape of cloud warehouses, lakehouses, data mesh, data products, etc. factor into that need for centralized/unified platforms?A surprising finding was that a third of respondents have not fully grasped the risk of significant inaccuracies in their AI models if they fail to prioritize data management. In your experience, what are the biggest blind spots for data and analytics leaders?Looking at the maturity charts, companies rate themselves highly on "Developing a data management strategy" (65%) but lag significantly in areas like "Automated tools for data integration and cleansing" (42%) and "Conducting bias-detection audits" (24%). If you were advising a data engineering team lead based on these findings, what would you tell them to prioritize in the next 6-12 months to bridge the gap between strategy and a truly scalable, trustworthy data foundation for AI?The report states that 83% of companies expect to integrate more data sources for their AI in the next year. For a data engineer on the ground, what is the most important capability they need to build into their platform to handle this influx?What are the most interesting, innovative, or unexpected ways that you have seen teams addressing the new and accelerated data needs for AI applications?What are some of the noteworthy trends or predictions that you have for the near-term future of the impact that AI is having or will have on data teams and systems?Contact Info LinkedInParting Question From your perspective, what is the biggest gap in the tooling or technology for data management today?Closing Announcements Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.Links BoomiData ManagementIntegration & Automation DemoAgentstudioData Connector Agent WebinarSurvey ResultsData GovernanceShadow ITPodcast EpisodeThe intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Building Inference Workflows with Tile Languages

The world of generative AI is expanding. New models are hitting the market daily. The field has bifurcated between model training and model inference. The need for fast inference has led to numerous Tile languages to be developed. These languages use concepts from linear algebra and borrow common numpy apis. In this talk we will show how tiling works and how to build inference models from scratch in pure Python with embedded tile languages. The goal is to provide attendees with a good overview that can be integrated in common data pipelines.

Red Teaming AI: Getting Started with PyRIT for Safer Generative AI Systems

As generative AI systems become more powerful and widely deployed, ensuring safety and security is critical. This talk introduces AI red teaming—systematically probing AI systems to uncover potential risks—and demonstrates how to get started using PyRIT (Python Risk Identification Toolkit), an open-source framework for automated and semi-automated red teaming of generative AI systems. Attendees will leave with a practical understanding of how to identify and mitigate risks in AI applications, and how PyRIT can help along the way.

Sujay Dutta and Sidd Rajagopal, authors of "Data as the Fourth Pillar," join the show to make the compelling case that for C-suite leaders obsessed with AI, data must be elevated to the same level as people, process, and technology. They provide a practical playbook for Chief Data Officers (CDOs) to escape the "cost center" trap by focusing on the "demand side" (business value) instead of just the "supply side" (technology). They also introduce frameworks like "Data Intensity" and "Total Addressable Value (TAV)" for data. We also tackle the reality of AI "slopware" and the "Great Pacific garbage patch" of junk data , explaining how to build the critical "context" (or "Data Intelligence Layer") that most GenAI projects are missing. Finally, they explain why the CDO must report directly to the CEO to play "offense," not defense.

Most generative AI projects look impressive in a demo but fail in the real world. This session moves beyond the hype to offer a practical, engineering-focused playbook on the architectural patterns and hard-won lessons required to take your LLM application from a cool prototype to a scalable product serving thousands of users. We'll uncover the unglamorous but essential truths about observability, routing, and a production-first mindset.

Data quality and AI reliability are two sides of the same coin in today's technology landscape. Organizations rushing to implement AI solutions often discover that their underlying data infrastructure isn't prepared for these new demands. But what specific data quality controls are needed to support successful AI implementations? How do you monitor unstructured data that feeds into your AI systems? When hallucinations occur, is it really the model at fault, or is your data the true culprit? Understanding the relationship between data quality and AI performance is becoming essential knowledge for professionals looking to build trustworthy AI systems. Shane Murray is a seasoned data and analytics executive with extensive experience leading digital transformation and data strategy across global media and technology organizations. He currently serves as Senior Vice President of Digital Platform Analytics at Versant Media, where he oversees the development and optimization of analytics capabilities that drive audience engagement and business growth. In addition to his corporate leadership role, he is a founding member of InvestInData, an angel investor collective of data leaders supporting early-stage startups advancing innovation in data and AI. Prior to joining Versant Media, Shane spent over three years at Monte Carlo, where he helped shape AI product strategy and customer success initiatives as Field CTO. Earlier, he spent nearly a decade at The New York Times, culminating as SVP of Data & Insights, where he was instrumental in scaling the company’s data platforms and analytics functions during its digital transformation. His earlier career includes senior analytics roles at Accenture Interactive, Memetrics, and Woolcott Research. Based in New York, Shane continues to be an active voice in the data community, blending strategic vision with deep technical expertise to advance the role of data in modern business. In the episode, Richie and Shane explore AI disasters and success stories, the concept of being AI-ready, essential roles and skills for AI projects, data quality's impact on AI, and much more. Links Mentioned in the Show: Versant MediaConnect with ShaneCourse: Responsible AI PracticesRelated Episode: Scaling Data Quality in the Age of Generative AI with Barr Moses, CEO of Monte Carlo Data, Prukalpa Sankar, Cofounder at Atlan, and George Fraser, CEO at FivetranRewatch RADAR AI  New to DataCamp? Learn on the go using the DataCamp mobile appEmpower your business with world-class data and AI skills with DataCamp for business

Bridging the Gap: Using Generative AI for Audience Insight & Segmentation

Seats are limited to 16 attendees. Register here to save your spot. 

https://www.snowflake.com/event/marketing-data-stack-roundtable-swt-amsterdam-2025/

This roundtable explores how generative AI (GenAI) is revolutionizing audience segmentation and insights. The discussion will focus on practical, in-the-moment applications that empower marketers and media professionals to move beyond static data analysis. We will examine how GenAI tools, like those available natively on Snowflake Cortex, can translate complex data filters into rich, narrative-driven audience descriptions. 

The conversation will also highlight how GenAI capabilities streamline workflows by allowing users to build audience segments using natural language, democratizing access to data and accelerating decision-making. The goal is to provide a clear, concise, and actionable understanding of how GenAI is bridging the gap between raw data and powerful, human-centric insights.

--- Miami CDO Cheriene Floyd shares how Generative AI is shifting the way cities think about their data.

--- A Chief Data Officer’s role in cities is to turn data into a strategic asset, enabling insights that can be leveraged for resident impact. How is this responsibility changing in the age of generative AI?

--- We’re joined today by Cheriene Floyd to discuss the shift in how CDOs are making data work for their residents. Floyd discusses her path from serving as a strategic planning and performance manager in the City of Miami to becoming the city’s first Chief Data Officer. During her ten years of service as a CDO, she has come to view the role as upholding three key pillars: data governance, analytics, and capacity-building, helping departments connect the dots between disparate datasets to see the bigger picture.

--- As AI changes our relationship to data, it further highlights the adage, “garbage in, garbage out.” Floyd discusses how broad awareness of this truth has manifested in greater buy-in among city staff to leverage data to solve problems, while private sector AI adoption has shifted residents’ expectations when seeking public services. Consequently, the task of shepherding public data becomes even more important, and she offers recommendations from her own experiences to meet these challenges.

--- Learn more about GovEx!

The promise of AI in enterprise settings is enormous, but so are the privacy and security challenges. How do you harness AI's capabilities while keeping sensitive data protected within your organization's boundaries? Private AI—using your own models, data, and infrastructure—offers a solution, but implementation isn't straightforward. What governance frameworks need to be in place? How do you evaluate non-deterministic AI systems? When should you build in-house versus leveraging cloud services? As data and software teams evolve in this new landscape, understanding the technical requirements and workflow changes is essential for organizations looking to maintain control over their AI destiny. Manasi Vartak is Chief AI Architect and VP of Product Management (AI Platform) at Cloudera. She is a product and AI leader with more than a decade of experience at the intersection of AI infrastructure, enterprise software, and go-to-market strategy. At Cloudera, she leads product and engineering teams building low-code and high-code generative AI platforms, driving the company’s enterprise AI strategy and enabling trusted AI adoption across global organizations. Before joining Cloudera through its acquisition of Verta, Manasi was the founder and CEO of Verta, where she transformed her MIT research into enterprise-ready ML infrastructure. She scaled the company to multi-million ARR, serving Fortune 500 clients in finance, insurance, and capital markets, and led the launch of enterprise MLOps and GenAI products used in mission-critical workloads. Manasi earned her PhD in Computer Science from MIT, where she pioneered model management systems such as ModelDB — foundational work that influenced the development of tools like MLflow. Earlier in her career, she held research and engineering roles at Twitter, Facebook, Google, and Microsoft. In the episode, Richie and Manasi explore AI's role in financial services, the challenges of AI adoption in enterprises, the importance of data governance, the evolving skills needed for AI development, the future of AI agents, and much more. Links Mentioned in the Show: ClouderaCloudera Evolve ConferenceCloudera Agent StudioConnect with ManasiCourse: Introduction to AI AgentsRelated Episode: RAG 2.0 and The New Era of RAG Agents with Douwe Kiela, CEO at Contextual AI & Adjunct Professor at Stanford UniversityRewatch RADAR AI  New to DataCamp? Learn on the go using the DataCamp mobile appEmpower your business with world-class data and AI skills with DataCamp for business

Adnan Hodzic, Lead Engineer and GenAI Delivery Lead at ING, joined Yuliia how ING successfully scaled generative AI from experimentation to enterprise production. With over 60 GenAI applications now running in production across the bank, Adnan explains ING's pragmatic approach: building internal AI platforms that balance innovation speed with regulatory compliance, treating European banking regulations as features rather than constraints, and fostering a culture where 300+ experiments can safely run while only the best reach production. He discusses the critical role of their Prompt Flow Studio in democratizing AI development, why customer success teams saw immediate productivity gains, how ING structures AI governance without killing innovation, and his perspective on the hype cycle versus real enterprise value. Adnan's blog: https://foolcontrol.org Adnan's Youtube channel: https://www.youtube.com/AdnanHodzicLinkedIn: https://linkedin.com/in/AdnanHodzicTwitter/X: https://twitter.com/fooctrl