talk-data.com

Topic: Data Governance

Tags: data_management · compliance · data_quality

417 tagged activities

[Activity trend chart: peak of 90 activities per quarter, 2020-Q1 to 2026-Q1]

Activities

417 activities · Newest first

Miami CDO Cheriene Floyd shares how Generative AI is shifting the way cities think about their data.

A Chief Data Officer’s role in cities is to turn data into a strategic asset, enabling insights that can be leveraged for resident impact. How is this responsibility changing in the age of generative AI?

We’re joined today by Cheriene Floyd to discuss the shift in how CDOs are making data work for their residents. Floyd discusses her path from serving as a strategic planning and performance manager in the City of Miami to becoming the city’s first Chief Data Officer. During her ten years of service as a CDO, she has come to view the role as upholding three key pillars: data governance, analytics, and capacity-building, which helps departments connect the dots between disparate datasets to see the bigger picture.

As AI changes our relationship to data, it further highlights the adage, “garbage in, garbage out.” Floyd discusses how broad awareness of this truth has manifested in greater buy-in among city staff to leverage data to solve problems, while private sector AI adoption has shifted residents’ expectations when seeking public services. Consequently, the task of shepherding public data becomes even more important, and she offers recommendations from her own experiences to meet these challenges.

Learn more about GovEx!

The promise of AI in enterprise settings is enormous, but so are the privacy and security challenges. How do you harness AI's capabilities while keeping sensitive data protected within your organization's boundaries? Private AI—using your own models, data, and infrastructure—offers a solution, but implementation isn't straightforward. What governance frameworks need to be in place? How do you evaluate non-deterministic AI systems? When should you build in-house versus leveraging cloud services? As data and software teams evolve in this new landscape, understanding the technical requirements and workflow changes is essential for organizations looking to maintain control over their AI destiny.

Manasi Vartak is Chief AI Architect and VP of Product Management (AI Platform) at Cloudera. She is a product and AI leader with more than a decade of experience at the intersection of AI infrastructure, enterprise software, and go-to-market strategy. At Cloudera, she leads product and engineering teams building low-code and high-code generative AI platforms, driving the company’s enterprise AI strategy and enabling trusted AI adoption across global organizations. Before joining Cloudera through its acquisition of Verta, Manasi was the founder and CEO of Verta, where she transformed her MIT research into enterprise-ready ML infrastructure. She scaled the company to multi-million ARR, serving Fortune 500 clients in finance, insurance, and capital markets, and led the launch of enterprise MLOps and GenAI products used in mission-critical workloads. Manasi earned her PhD in Computer Science from MIT, where she pioneered model management systems such as ModelDB — foundational work that influenced the development of tools like MLflow. Earlier in her career, she held research and engineering roles at Twitter, Facebook, Google, and Microsoft.

In the episode, Richie and Manasi explore AI's role in financial services, the challenges of AI adoption in enterprises, the importance of data governance, the evolving skills needed for AI development, the future of AI agents, and much more.

Links Mentioned in the Show:

- Cloudera
- Cloudera Evolve Conference
- Cloudera Agent Studio
- Connect with Manasi
- Course: Introduction to AI Agents
- Related Episode: RAG 2.0 and The New Era of RAG Agents with Douwe Kiela, CEO at Contextual AI & Adjunct Professor at Stanford University
- Rewatch RADAR AI

New to DataCamp?

- Learn on the go using the DataCamp mobile app
- Empower your business with world-class data and AI skills with DataCamp for business

Summary In this episode of the Data Engineering Podcast, Matt Topper, president of UberEther, talks about the complex challenge of identity, credentials, and access control in modern data platforms. With the shift to composable ecosystems, integration burdens have exploded, fracturing governance and auditability across warehouses, lakes, files, vector stores, and streaming systems. Matt shares practical solutions, including propagating user identity via JWTs, externalizing policy with engines like OPA/Rego and Cedar, and using database proxies for native row/column security. He also explores catalog-driven governance, lineage-based label propagation, and OpenTDF for binding policies to data objects. The conversation covers machine-to-machine access, short-lived credentials, workload identity, and constraining access by interface choke points, as well as lessons from Zanzibar-style policy models and the human side of enforcement. Matt emphasizes the need for trust composition - unifying provenance, policy, and identity context - to answer questions about data access, usage, and intent across the entire data path.
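To make that concrete, here is a minimal Python sketch of the JWT identity-propagation pattern the episode describes: carry the end user's identity in a token and let a database proxy derive row-level predicates from its claims. It assumes PyJWT; the claim names, shared secret, and naive SQL rewrite are illustrative, not UberEther's implementation.

```python
# Minimal sketch of JWT identity propagation for row-level security.
# Assumptions: PyJWT, a shared HS256 secret, and hypothetical claims.
import jwt

SECRET = "demo-secret"  # in practice: verify against your IDP's public keys

# An upstream service mints a short-lived token carrying end-user context.
token = jwt.encode(
    {"sub": "alice", "dept": "finance", "clearance": "internal"},
    SECRET,
    algorithm="HS256",
)

def rewrite_query(base_sql: str, bearer_token: str) -> str:
    """A database proxy decodes the token and appends a row-level
    predicate so the warehouse enforces policy for this user."""
    claims = jwt.decode(bearer_token, SECRET, algorithms=["HS256"])
    predicate = f"dept = '{claims['dept']}'"  # demo only; prefer native row-access policies
    return f"{base_sql} WHERE {predicate}"

print(rewrite_query("SELECT * FROM invoices", token))
# -> SELECT * FROM invoices WHERE dept = 'finance'
```

In a real deployment the predicate would come from an external policy engine (OPA/Rego or Cedar, as discussed) rather than string concatenation, and the token would be a short-lived workload credential.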

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed - flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming - Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.

Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.

Composable data infrastructure is great, until you spend all of your time gluing it together. Bruin is an open source framework, driven from the command line, that makes integration a breeze. Write Python and SQL to handle the business logic, and let Bruin handle the heavy lifting of data movement, lineage tracking, data quality monitoring, and governance enforcement. Bruin allows you to build end-to-end data workflows using AI, has connectors for hundreds of platforms, and helps data teams deliver faster. Teams that use Bruin need less engineering effort to process data and benefit from a fully integrated data platform. Go to dataengineeringpodcast.com/bruin today to get started. And for dbt Cloud customers, they'll give you $1,000 credit to migrate to Bruin Cloud.

Your host is Tobias Macey and today I'm interviewing Matt Topper about the challenges of managing identity and access controls in the context of data systems.

Interview

- Introduction
- How did you get involved in the area of data management?
- The data ecosystem is a uniquely challenging space for creating and enforcing technical controls for identity and access control. What are the key considerations for designing a strategy for addressing those challenges?
- For data access, the off-the-shelf options are typically on either extreme of too coarse or too granular in their capabilities. What do you see as the major factors that contribute to that situation?
- Data governance policies are often used as the primary means of identifying what data can be accessed by whom, but translating that into enforceable constraints is often left as a secondary exercise. How can we as an industry make that a more manageable and sustainable practice?
- How can the audit trails that are generated by data systems be used to inform the technical controls for identity and access?
- How can the foundational technologies of our data platforms be improved to make identity and authz a more composable primitive?
- How does the introduction of streaming/real-time data ingest and delivery complicate the challenges of security controls?
- What are the most interesting, innovative, or unexpected ways that you have seen data teams address ICAM?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on ICAM?
- What are the aspects of ICAM in data systems that you are paying close attention to?
- What are your predictions for the industry adoption or enforcement of those controls?

Contact Info

- LinkedIn

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

Links

- UberEther
- JWT == JSON Web Token
- OPA == Open Policy Agent
- Rego
- PingIdentity
- Okta
- Microsoft Entra
- SAML == Security Assertion Markup Language
- OAuth
- OIDC == OpenID Connect
- IDP == Identity Provider
- Kubernetes
- Istio
- Amazon CEDAR policy language
- AWS IAM
- PII == Personally Identifiable Information
- CISO == Chief Information Security Officer
- OpenTDF
- OpenFGA
- Google Zanzibar
- Risk Management Framework
- Model Context Protocol
- Google Data Project
- TPM == Trusted Platform Module
- PKI == Public Key Infrastructure
- Passkeys
- DuckLake (Podcast Episode)
- Accumulo
- JDBC
- OpenBao
- Hashicorp Vault
- LDAP

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

F. Hoffmann-La Roche is the world’s leading provider of cancer treatments, a biotech company, the 4th largest pharmaceutical company, and currently Europe’s 3rd largest company by market cap.

This session will explore Roche’s Snowflake environment: its approach to data mesh, including object tagging as mandatory data governance; Cortex AI, including an MCP server; data observability supporting data mesh, with a use-case deep dive and success stories; and the roadmap for the Pharma Technical domain.
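As a rough illustration of the object-tagging idea, here is a hedged Python sketch using the Snowflake connector; the account, tag, and table names are placeholders, not Roche's actual setup.

```python
# Illustrative sketch: applying a mandatory governance tag to a table
# in Snowflake. Credentials and object names are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="***",
)
cur = conn.cursor()

# Define the tag once, centrally, then require it on every data product.
cur.execute("CREATE TAG IF NOT EXISTS governance.tags.data_domain")
cur.execute(
    "ALTER TABLE analytics.public.batch_records "
    "SET TAG governance.tags.data_domain = 'pharma_technical'"
)

# Governance jobs can read tags back and flag untagged objects.
cur.execute(
    "SELECT SYSTEM$GET_TAG('governance.tags.data_domain', "
    "'analytics.public.batch_records', 'table')"
)
print(cur.fetchone()[0])  # -> pharma_technical
```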

See how a global electronics manufacturer demonstrated the value of modern data governance and quality by integrating Semarchy MDM natively on Snowflake. This session highlights the reference architecture, unification of critical data domains, and seamless governance at scale—showing how trusted, high-quality data empowers innovation and business impact.

At Doctolib, data governance is not limited to compliance: it actively supports our corporate strategy. In this session, Diana Carrondo, Data Governance Lead at Doctolib, and Tristan Mayer, General Manager Catalog at Coalesce, will share how a new data catalog enabled an offensive approach to governance. You will learn how Doctolib overcame the limits of its previous tool by improving adoption, structuring its taxonomy to better protect data, integrating the catalog into its data governance KPIs, and connecting it to its internal AI tools. A concrete case study for turning your catalog into a strategic lever.

Selecting a suitable and high-performing target group for CRM initiatives—such as newsletters and coupons—often involves time-consuming, manual coordination across multiple teams. In this session, we will demonstrate how we leveraged the combined strengths of Snowpark, Streamlit, and dbt to build a self-service application that allows CRM managers to define target groups independently—without relying on analytics resources.

Our solution delivers real-time feedback based on user input, dramatically reducing turnaround times and simplifying the targeting workflow. We will explore how Snowpark acts as a seamless bridge between Streamlit and Snowflake, enabling efficient, in-database processing. Meanwhile, dbt ensures data consistency and reusability through standardized data products. 

Join us to discover how this integrated approach accelerates decision-making, ensures data governance, and unlocks scalable, self-service capabilities for your CRM teams.
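As a hint of how such an application can fit together, here is a minimal sketch assuming a Streamlit-in-Snowflake deployment; the table and column names (CUSTOMERS, LIFETIME_VALUE, NEWSLETTER_OPT_IN) are hypothetical, not the presenters' schema.

```python
# Minimal self-service target-group builder: Streamlit UI + Snowpark pushdown.
import streamlit as st
from snowflake.snowpark.context import get_active_session
from snowflake.snowpark.functions import col

session = get_active_session()  # session provided by Streamlit-in-Snowflake

st.title("CRM Target Group Builder")

# CRM managers define the segment with widgets instead of filing SQL tickets.
min_ltv = st.slider("Minimum lifetime value (EUR)", 0, 5000, 100)
opted_in = st.checkbox("Newsletter opt-in only", value=True)

# Snowpark builds a lazy query plan; all filtering executes inside Snowflake.
customers = session.table("ANALYTICS.CRM.CUSTOMERS")  # a dbt-managed data product
segment = customers.filter(col("LIFETIME_VALUE") >= min_ltv)
if opted_in:
    segment = segment.filter(col("NEWSLETTER_OPT_IN") == True)

# Real-time feedback: the count runs in-database and returns immediately.
st.metric("Customers in target group", segment.count())
st.dataframe(segment.limit(100).to_pandas())  # preview a sample only
```

Here dbt standardizes the underlying data product while Snowpark keeps all heavy computation inside Snowflake, which is what makes the instant feedback loop feasible.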

It’s no secret that AI relies on “rock solid” data. However, given the vast amounts of data that companies now have spread across distributed SaaS, on-premises, and multi-cloud data estates, many companies are a million miles away from this. We are also well past the point where people can govern data on their own. They need help, and a total rethink is now needed to conquer data complexity and create a high-quality, compliant data foundation for AI success.


In this watershed keynote, conference chair Mike Ferguson details what needs to be done to govern data in the era of AI and how companies can conquer the complexity they face by implementing an always-on, active, and unified approach to data governance that continuously detects, automates, and consistently enforces multiple types of policies across a distributed data estate. The session will cover:

• Current problems with data governance today and why old approaches are broken

• Requirements to dramatically improve data governance using AI and AI automation

• The need for an integrated and unified data governance platform

• Why a data catalog, data intelligence, data observability, AI Agents and orchestration all need to be integrated for AI-Assisted active data governance

• Understanding the AI-assisted data governance services and AI-Agents you need

• Establishing health metrics to measure the effectiveness of your data governance program (see the sketch after this list)

• Creating a Data Governance Action Framework for your enterprise

• Monitoring the health and security of your data using data governance observability

• Enabling continuous reporting and AI-Assisted data governance action automation

• Implementing data governance AI Agents for different data governance disciplines
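For the health-metrics item above, here is a minimal sketch of what a governance scorecard computation might look like, assuming a catalog export with per-table ownership and certification flags; the fields are illustrative and not tied to any specific product or framework.

```python
# Illustrative governance health metrics computed from a catalog export.
tables = [  # in practice: pulled from your data catalog's API
    {"name": "sales.orders", "owner": "team-sales", "certified": True,  "documented": True},
    {"name": "crm.contacts", "owner": None,         "certified": False, "documented": True},
    {"name": "hr.payroll",   "owner": "team-hr",    "certified": False, "documented": False},
]

def pct(flag):
    """Share of tables for which the predicate holds, as a percentage."""
    return 100.0 * sum(1 for t in tables if flag(t)) / len(tables)

# Simple, trendable metrics for a governance scorecard.
print(f"ownership coverage:     {pct(lambda t: t['owner'] is not None):.0f}%")
print(f"certified datasets:     {pct(lambda t: t['certified']):.0f}%")
print(f"documentation coverage: {pct(lambda t: t['documented']):.0f}%")
```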

While 95% of enterprise AI pilots fail to deliver business value, Secoda's customers are seeing a different reality, with 76% of AI usage focused on core business intelligence workflows rather than isolated experiments.

Join Etai Mizrahi, Co-Founder & CEO of Secoda, as he shares how companies like Dialpad achieved company-wide AI adoption by moving 200+ employees from traditional dashboards to natural language analytics. Learn how Secoda's multi-agent AI architecture transforms data governance from manual overhead into automated workflows, along with practical strategies for scaling AI beyond pilots into essential infrastructure that delivers measurable ROI.

When Virgin Media and O2 merged, they faced the challenge of unifying thousands of pipelines and platforms while keeping 25 million customers connected. Victor Rivero, Head of Data Governance & Quality, shares how his team is transforming the combined data estate into a trusted source of truth by embedding Monte Carlo’s Data + AI Observability across BigQuery, Atlan, dbt, and Tableau. Learn how they've begun their journey to cut data downtime, enforce reliability dimensions, and measure success while creating a scalable blueprint for enterprise observability.

For years, data governance has been about guiding people and their interpretations. We build glossaries, descriptions and documentation to keep analysts and business users aligned. But what happens when your primary “user” isn’t human? As agentic workflows, LLMs, and AI-driven decision systems become mainstream, the way we govern data must evolve. The controls that once relied on human interpretation now need to be machine-readable, unambiguous, and able to support near-real-time reasoning. The stakes are high: a governance model designed for people may look perfectly clear to us but lead an AI straight into hallucinations, bias, or costly automation errors.

This session explores what it really means to make governance “AI-ready.” We’ll look at the shift from human-centric to agent-centric governance, practical strategies for structuring metadata so that agents can reliably understand and act on it, and the new risks that emerge when AI is the primary consumer of your data catalog. We’ll cover patterns and emerging practices, and discuss how to transition to a new governance operating model. Whether you’re a data leader, platform engineer, or AI practitioner, you’ll leave with an appreciation of governance approaches for a world where your first stakeholder might not even be human.
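As one possible shape for agent-centric governance, here is a hedged Python sketch of machine-readable dataset policies an agent could check before acting. The schema and field names are hypothetical, not any vendor's catalog format.

```python
# Hypothetical machine-readable governance metadata for an AI agent.
from dataclasses import dataclass

@dataclass
class DatasetPolicy:
    """Governance facts stated as data, not prose, so an agent can reason over them."""
    name: str
    sensitivity: str            # closed vocabulary: "public" | "internal" | "pii"
    allowed_purposes: set[str]  # explicit allow-list, no free-text interpretation
    freshness_sla_hours: int    # how stale the data may be
    certified: bool             # has a human steward signed off?

def agent_may_use(policy: DatasetPolicy, purpose: str, needs_fresh: bool) -> bool:
    """An unambiguous yes/no that a tool-calling agent can rely on."""
    if purpose not in policy.allowed_purposes:
        return False
    if needs_fresh and policy.freshness_sla_hours > 24:
        return False
    return policy.certified

# Example: the agent checks before using a dataset to answer a revenue question.
orders = DatasetPolicy(
    name="analytics.orders_daily",
    sensitivity="internal",
    allowed_purposes={"reporting", "forecasting"},
    freshness_sla_hours=6,
    certified=True,
)
assert agent_may_use(orders, "reporting", needs_fresh=True)
assert not agent_may_use(orders, "marketing_email", needs_fresh=False)
```

The design point is that every field is a closed, testable value: a human-oriented description can be interpreted charitably, but an agent needs a policy it can evaluate without guessing.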

Traditional data governance is often insufficient for the amplified risks of live AI models, from bias to black-box decisions. In this session, we'll discuss a capability framework for full-lifecycle AI governance, designed to manage model behavior, build trust, and ensure your AI performs as intended over time.

In this session, we’ll share our transformation journey from a traditional, centralised data warehouse to a modern data lakehouse architecture, powered by data mesh principles. We’ll explore the challenges we faced with legacy systems, the strategic decisions that led us to adopt a lakehouse model, and how data mesh enabled us to decentralise ownership, improve scalability, and enhance data governance.

As AI reshapes every aspect of data management, organizations worldwide are witnessing a fundamental transformation in how data governance operates. This panel discussion, hosted by DataHub, brings together two forward-thinking customers to explore the revolutionary journey from traditional governance models to AI-autonomous systems. Our expert panelists will share real-world experiences navigating the four critical stages of this evolution: AI-assisted governance, where machine learning augments human decision-making; AI-driven governance, where algorithms actively guide policy enforcement; AI-run governance, where systems independently execute complex workflows; and ultimately, AI-autonomous governance, where intelligent systems self-manage and continuously optimize data stewardship processes. Through candid discussions of implementation challenges, measurable outcomes, and strategic insights, attendees will gain practical understanding of how leading organizations are preparing for this transformative shift. The session will address key questions around trust, accountability, and the changing role of data professionals in an increasingly automated governance landscape, providing actionable guidance for organizations at any stage of their AI governance journey.

How do you prepare a global industrial business for AI? At Secil, the answer was data governance. In this session, Ricardo Carvalho shares how the team replaced siloed systems with a unified data platform using Domo, delivering enterprise-level analytics, smarter operations, and a foundation for scalable AI that drives real outcomes in just 18 months.

As organisations scale their data ecosystems, ensuring consistency, compliance, and usability across multiple data products becomes a critical challenge. This session explores a practical approach to implementing a Data Governance framework that balances control with agility.

Key takeaways:

- We will discuss key principles, common pitfalls, and best practices for aligning governance with business objectives while fostering innovation.

- Attendees will gain insights into designing governance policies, automating compliance (see the sketch after this list), and driving adoption across decentralised data teams.

- Real-world examples will illustrate how to create a scalable, federated model that enhances data quality, security, and interoperability across diverse data products.
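As referenced in the takeaways, here is a minimal sketch of automating compliance: a CI-style check that fails the pipeline when a data product violates policy. The metadata layout and rules are illustrative assumptions, not a specific framework.

```python
# Illustrative CI gate: validate data-product metadata against governance policy.
import sys

DATA_PRODUCTS = [  # in practice: loaded from catalog exports or YAML specs
    {"name": "sales.orders", "owner": "team-sales", "pii_columns": [], "masked": []},
    {"name": "crm.contacts", "owner": None, "pii_columns": ["email"], "masked": []},
]

def violations(product: dict) -> list[str]:
    """Return human-readable policy violations for one data product."""
    issues = []
    if not product["owner"]:
        issues.append("missing owner")
    unmasked = set(product["pii_columns"]) - set(product["masked"])
    if unmasked:
        issues.append(f"unmasked PII columns: {sorted(unmasked)}")
    return issues

failed = False
for product in DATA_PRODUCTS:
    for issue in violations(product):
        failed = True
        print(f"{product['name']}: {issue}")

sys.exit(1 if failed else 0)  # non-zero exit blocks the deployment
```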

Three out of four companies are betting big on AI – but most are digging on shifting ground. In this $100 billion gold rush, none of these investments will pay off without data quality and strong governance – and that remains a challenge for many organizations. Not every enterprise has a solid data governance practice and maturity models vary widely. As a result, investments in innovation initiatives are at risk of failure. What are the most important data management issues to prioritize? See how your organization measures up and get ahead of the curve with Actian.