talk-data.com
Activities & events
| Title & Speakers | Event |
|---|---|
|
The AI Data Paradox: High Trust in Models, Low Trust in Data
2025-11-09 · 23:53
Ariel Pohoryles
– guest
@ Rivery
,
Tobias Macey
– host
Summary
In this episode of the Data Engineering Podcast Ariel Pohoryles, head of product marketing for Boomi's data management offerings, talks about a recent survey of 300 data leaders on how organizations are investing in data to scale AI. He shares a paradox uncovered in the research: while 77% of leaders trust the data feeding their AI systems, only 50% trust their organization's data overall. Ariel explains why truly productionizing AI demands broader, continuously refreshed data with stronger automation and governance, and highlights the challenges posed by unstructured data and vector stores. The conversation covers the need to shift from manual reviews to automated pipelines, the resurgence of metadata and master data management, and the importance of guardrails, traceability, and agent governance. Ariel also predicts a growing convergence between data teams and application integration teams and advises leaders to focus on high-value use cases, aggressive pipeline automation, and cataloging and governing the coming sprawl of AI agents, all while using AI to accelerate data engineering itself.
Announcements
- Hello and welcome to the Data Engineering Podcast, the show about modern data management
- Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.
- Data migrations are brutal. They drag on for months, sometimes years, burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
- Composable data infrastructure is great, until you spend all of your time gluing it together. Bruin is an open source framework, driven from the command line, that makes integration a breeze. Write Python and SQL to handle the business logic, and let Bruin handle the heavy lifting of data movement, lineage tracking, data quality monitoring, and governance enforcement. Bruin allows you to build end-to-end data workflows using AI, has connectors for hundreds of platforms, and helps data teams deliver faster. Teams that use Bruin need less engineering effort to process data and benefit from a fully integrated data platform. Go to dataengineeringpodcast.com/bruin today to get started. And for dbt Cloud customers, they'll give you $1,000 credit to migrate to Bruin Cloud.
- Your host is Tobias Macey and today I'm interviewing Ariel Pohoryles about data management investments that organizations are making to enable them to scale AI implementations
Interview
- Introduction
- How did you get involved in the area of data management?
- Can you start by describing the motivation and scope of your recent survey on data management investments for AI across your respondents?
- What are the key takeaways that were most significant to you?
- The survey reveals a fascinating paradox: 77% of leaders trust the data used by their AI systems, yet only half trust their organization's overall data quality. For our data engineering audience, what does this suggest about how companies are currently sourcing data for AI? Does it imply they are using narrow, manually-curated "golden datasets," and what are the technical challenges and risks of that approach as they try to scale?
- The report highlights a heavy reliance on manual data quality processes, with one expert noting companies feel it's "not reliable to fully automate validation" for external or customer data. At the same time, maturity in "Automated tools for data integration and cleansing" is low, at only 42%. What specific technical hurdles or organizational inertia are preventing teams from adopting more automation in their data quality and integration pipelines?
- There was a significant point made that with generative AI, "biases can scale much faster," making automated governance essential. From a data engineering perspective, how does the data management strategy need to evolve to support generative AI versus traditional ML models? What new types of data quality checks, lineage tracking, or monitoring for feedback loops are required when the model itself is generating new content based on its own outputs?
- The report champions a "centralized data management platform" as the "connective tissue" for reliable AI. How do you see the scale and data maturity impacting the realities of that effort?
- How do architectural patterns in the shape of cloud warehouses, lakehouses, data mesh, data products, etc. factor into that need for centralized/unified platforms?
- A surprising finding was that a third of respondents have not fully grasped the risk of significant inaccuracies in their AI models if they fail to prioritize data management. In your experience, what are the biggest blind spots for data and analytics leaders?
- Looking at the maturity charts, companies rate themselves highly on "Developing a data management strategy" (65%) but lag significantly in areas like "Automated tools for data integration and cleansing" (42%) and "Conducting bias-detection audits" (24%). If you were advising a data engineering team lead based on these findings, what would you tell them to prioritize in the next 6-12 months to bridge the gap between strategy and a truly scalable, trustworthy data foundation for AI?
- The report states that 83% of companies expect to integrate more data sources for their AI in the next year. For a data engineer on the ground, what is the most important capability they need to build into their platform to handle this influx?
- What are the most interesting, innovative, or unexpected ways that you have seen teams addressing the new and accelerated data needs for AI applications?
- What are some of the noteworthy trends or predictions that you have for the near-term future of the impact that AI is having or will have on data teams and systems?
Contact Info
- LinkedIn
Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?
Closing Announcements
- Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
Links
- Boomi
- Data Management
- Integration & Automation Demo
- Agentstudio
- Data Connector Agent Webinar
- Survey Results
- Data Governance
- Shadow IT
- Podcast Episode
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA |
Data Engineering Podcast |
|
[Online] Democratizing Bayesian Modeling with Insight Agents: A Case Study
2025-06-17 · 16:00
🎙️ Speaker: Andy Heusser, Luca Fiaschi \| ⏰ Time: 4 PM UTC / 9 AM PT / 12 PM ET / 6 PM Berlin
Insight Agents are purpose‑built AI coworkers that transform demanding analytical workflows into push‑button tasks. Built on a modular blend of retrieval‑augmented generation (RAG), tool calling, and sandboxed code execution, each agent automates the full statistical pipeline, from data exploration and validation to model fitting and interpretation, without requiring deep technical expertise.
The session showcases our Marketing Mix Modeling (MMM) Insight Agent, which compresses weeks of Bayesian MMM work into minutes by delegating tasks to specialized sub‑agents. You'll see how this architecture delivers secure, explainable, and scalable results that let marketers focus on strategy instead of code.
MMM is only the first stop. We plan to extend the same framework to prototype Insight Agents for customer lifetime value, causal impact analysis, and more. We'll dig into the design principles, share implementation lessons, and outline the roadmap from today's collaborative "copilots" to tomorrow's autonomous digital coworkers that proactively surface insights and drive better business outcomes.
Read More:
📜 Outline of Talk / Agenda:
💼 About the speaker:
🔗 Connect with Andy: 👉 LinkedIn: https://www.linkedin.com/in/andrew-heusser-3b6587b1/ 👉 GitHub: https://github.com/andrewheusser
🔗 Connect with Luca: 👉 LinkedIn: https://www.linkedin.com/in/lfiaschi/
💼 About the Host:
📖 Code of Conduct: Please note that participants are expected to abide by PyMC's Code of Conduct. 🔗 Connecting with PyMC Labs: 🌐 Website: https://www.pymc-labs.com/ 👥 LinkedIn: https://www.linkedin.com/company/pymc-labs/ 🐦 Twitter: https://twitter.com/pymc_labs 🎥 YouTube: https://www.youtube.com/c/PyMCLabs 🤝 Meetup: https://www.meetup.com/pymc-labs-online-meetup/ |
[Online] Democratizing Bayesian Modeling with Insight Agents: A Case Study
|
|
Intro to SurrealDB: Leveraging Graphs and Vectors in your projects
2024-11-12 · 23:30
Join us for our next developer meetup, where we'll be exploring vector search and graph use cases in SurrealDB, and how you can leverage this functionality in your own projects. Whether you're a software engineer, developer, architect, data scientist, or data engineer, this event is for you. Expect an evening filled with insightful discussions, networking opportunities, and the chance to learn more about the latest in database technology!
🗣️ Speaker opportunity - submit your talk! Working on an interesting project that you would like to share with the community? Submit your talk here.
⏰ Date/time: November 12, 6:30 - 9:00 PM
📍 Location: The Yard: Columbus Circle Coworking Office Space NYC
Agenda
18:30 - 19:00 Welcome drinks, pizza & networking
Attendees arrive: grab a drink, explore the space and meet the SurrealDB team.
19:00 - 19:30 Intro to SurrealDB: Leveraging Graphs and Vectors in your projects
Alessandro Pireno, Director Solutions Engineering at SurrealDB
This talk will explore the world of Retrieval-Augmented Generation (RAG) systems. Using Jupyter notebooks, we will demonstrate a graph-based approach to RAG, leveraging the unique capabilities of SurrealDB. The presentation will show how to construct a knowledge graph from unstructured text data, covering key aspects like entity and relation extraction. Then, using the created knowledge graph, the talk will demonstrate how to implement and utilize a graph-based RAG system for question answering. The talk will highlight SurrealDB's graph capabilities for representing relationships between entities and its vector capabilities for semantic search within the knowledge graph. Attendees will learn how to construct a knowledge graph, use the graph structure for contextually rich question answering, and use SurrealDB's graph and vector features to power a RAG system.
19:30 - 20:00 Refreshments & networking
Connect with others in the tech community. Grab a slice of pizza & a drink and chat with other attendees and members of the SurrealDB team.
20:00 - 20:30 Understanding the AI & ML landscape
Ashok Chandra, Alessandro Pireno
Ashok will draw on his legal expertise and understanding of data privacy to discuss the legal and ethical considerations for developers and data scientists using generative AI (GenAI) to generate or extract data.
20:30 - 21:00 Refreshments and networking
21:00 End of event
--
Speaker: Alessandro Pireno \| LinkedIn
Alessandro is a seasoned product development and solutions leader with a proven track record of building and scaling data-driven solutions across diverse industries. He has led product strategy and development at companies like HUMAN and Omnicom Media Group, optimized data collection and distribution at GroupM, and was an early leader of success at Snowflake. With a deep understanding of the challenges and opportunities facing today's tech landscape, Alessandro is passionate about empowering organizations to unlock the full potential of their data through innovative database solutions.
Speaker: Ashok Chandra \| LinkedIn
Ashok Chandra is an attorney and finance professional with expertise in data privacy and intellectual property law. He holds certifications as a Certified Information Privacy Professional (CIPP/US) and a Certified Information Privacy Manager (CIPM) from the International Association of Privacy Professionals (IAPP). He is also a member of the New York State Bar Association, the Washington State Bar Association, and the United States Patent and Trademark Office Bar. Ashok's educational background includes a degree from New York University's Leonard N. Stern School of Business, where he was active in the Entertainment and Media Association. He also served as the Managing Editor of the Fordham Intellectual Property, Media, and Entertainment Law Journal. His published work includes a note on how companies attempt to extend copyright protection through derivative works and trademarks.
--
👉 New to SurrealDB? Get started here.
FAQs
Is the venue accessible?
The Yard is located on the 2nd floor. When you arrive, just let security know that you're heading up to The Yard.
Am I guaranteed a ticket at this event?
Our events are tech-focused and in the interest of keeping our events relevant and meaningful for those attending, tickets are issued at our discretion. We therefore reserve the right to refund ticket orders before the event and to request proof of identity and/or professional background upon entry.
Is this event for me?
SurrealDB events are for software engineers, developers, architects, data scientists, data engineers, or any tech professionals keen to discover more about SurrealDB: a scalable multi-model database that allows users and developers to focus on building their applications with ease and speed.
Are there any House Rules?
At SurrealDB, we are committed to providing live and online events that are safe and enjoyable for all attending. Please review our Code of Conduct and Privacy Policy for more information. It is compulsory for all attendees to be registered with a first and last name in order to attend. Any attendees who do not adhere to these requirements will be refused a ticket. |
Intro to SurrealDB: Leveraging Graphs and Vectors in your projects
|
|
#240 Generative AI in the Enterprise with Steve Holden, Senior Vice President and Head of Single-Family Analytics at Fannie Mae
2024-09-02 · 10:00
Steve Holden
– Senior Vice President and Head of Single-Family Analytics
@ Fannie Mae
The rapid rise of generative AI is changing how businesses operate, but with this change comes new challenges. How do you navigate the balance between innovation and risk, especially in a regulated industry? As organizations race to adopt AI, it's crucial to ensure that these technologies are not only transformative but also responsible. What steps can you take to harness AI's potential while maintaining control and transparency? And how can you build excitement and trust around AI within your organization, ensuring that everyone is ready to embrace this new era?
Steve Holden is the Senior Vice President and Head of Single-Family Analytics at Fannie Mae, leading a team of data science professionals, supporting loan underwriting, pricing and acquisition, securitization, loss mitigation, and loan liquidation for the company's multi-trillion-dollar Single-Family mortgage portfolio. He is also responsible for all Generative AI initiatives across the enterprise. His team provides real-time analytic solutions that guide thousands of daily business decisions necessary to manage this extensive mortgage portfolio. The team comprises experts in econometric models, machine learning, data engineering, data visualization, software engineering, and analytic infrastructure design. Holden previously served as Vice President of Credit Portfolio Management Analytics at Fannie Mae. Before joining Fannie Mae in 1999, he held several analytic leadership roles and worked on economic issues at the Economic Strategy Institute and the U.S. Bureau of Labor Statistics.
In the episode, Adel and Steve explore opportunities in generative AI, building a GenAI program, use-case prioritization, driving excitement and engagement for an AI-first culture, skills transformation, governance as a competitive advantage, challenges of scaling AI, future trends in AI, and much more.
Links Mentioned in the Show:
- Fannie Mae
- Steve's recent DataCamp Webinar: Bringing Generative AI to the Enterprise
- Video: Andrej Karpathy - [1hr Talk] Intro to Large Language Models
- Skill Track - AI Business Fundamentals
- Related Episode: Generative AI at EY with John Thompson, Head of AI at EY
- Rewatch sessions from RADAR: AI Edition
Join the DataFramed team!
- Data Evangelist
- Data & AI Video Creator
New to DataCamp?
- Learn on the go using the DataCamp mobile app
- Empower your business with world-class data and AI skills with DataCamp for business |
DataFramed |
|
Data Strategy and an Intro to Generative AI
2024-07-08 · 17:00
Updated agenda below.
Hosted and sponsored by Snap Analytics - snapanalytics.co.uk - an enterprise partner to help connect data, technology and teams to deliver exceptional business outcomes.
Location: Runway East Bristol Bridge, 1 Victoria St, Redcliffe, Bristol BS1 6AA
AGENDA
18:00 - 18:20 Meet & Greet
--------------
18:20 - 19:10 Intro to Generative AI by Calvin Fuss
Note: this slot was originally "From Words to Wisdom: Creating SQL and Insights using Conversational AI" by Isaac Ben-Akiva, but unfortunately Isaac is poorly and unable to make it.
- Intro to Generative AI
- Query your data using AI - Demo
- Solve real world business problems using AI - Demo
- AI Product architecture
- Business benefits
BIO
Calvin's career has allowed him to apply real-world experience to his academic grounding in computer science. His passion for data and software development has driven impactful projects for some of the world's largest corporations, including BAT, Imperial Tobacco, and Unilever, where he led data analytics, data science, and artificial intelligence consulting engagements.
--------------
19:10 - 19:40 Pizza and Networking
--------------
19:40 - 20:30 From Buzzwords to Breakthroughs: How to create an effective Data Strategy by David Rice
If you are tired of playing buzzword bingo with every new data announcement and want to get practical, this one is for you. In this session, you'll learn about:
- The 4 Pillars of an Effective Data Strategy
- Creating the Data Value Map to Identify Use Cases
- The Prioritisation Matrix for Data & Analytics
- Getting Buy-In To Fund Your Project
BIO
David has 20 years of experience in business intelligence, data warehousing, and analytics. He is passionate about helping businesses leverage data to achieve their strategic goals. As the CEO and Co-founder of Snap Analytics, he leads a team of data experts who deliver solutions that simplify and accelerate data projects for some of the world's biggest brands.
--------------
20:30 - Pub
--------------
About Snap Analytics - https://snapanalytics.co.uk/
Snap Analytics partners with enterprises to help them connect their data, technology and teams to deliver exceptional business outcomes.
----
Photos
We ask that you do NOT take photos at this meetup. We will invite people to be included in a group photo(s) during the event. Speakers will let you know if it's okay to photograph their presentation (excluding other attendees). You may see organisers taking photos during the talks. These will be of speakers, if they have agreed to this, and will not include faces of attendees. |
Data Strategy and an Intro to Generative AI
|