talk-data.com talk-data.com

Topic

Data Quality

data_management data_cleansing data_validation

8

tagged

Activity Trend

82 peak/qtr
2020-Q1 2026-Q1

Activities

Showing filtered results

Filtering by: Barr Moses ×
Building Reliable Agentic AI on Databricks

Agentic AI is the next evolution in artificial intelligence, with the potential to revolutionize the industry. However, its potential is matched only by its risk: without high-quality, trustworthy data, agentic AI can be exponentially dangerous. Join Barr Moses, CEO and Co-Founder of Monte Carlo, to explore how to leverage Databricks' powerful platform to ensure your agentic AI initiatives are underpinned by reliable, high-quality data. Barr will share: How data quality impacts agentic AI performance at every stage of the pipeline Strategies for implementing data observability to detect and resolve data issues in real-time Best practices for building robust, error-resilient agentic AI models on Databricks. Real-world examples of businesses harnessing Databricks' scalability and Monte Carlo’s observability to drive trustworthy AI outcomes Learn how your organization can deliver more reliable agentic AI and turn the promise of autonomous intelligence into a strategic advantage.Audio for this session is delivered in the conference mobile app, you must bring your own headphones to listen.

Panel: Shift Left Across the Data Lifecycle—Data Contracts, Transformations, Observability, and C...

Panel: Shift Left Across the Data Lifecycle—Data Contracts, Transformations, Observability, and Catalogs | Prukalpa Sankar, Tristan Handy, Barr Moses, Chad Sanderson | Shift Left Data Conference 2025

Join industry-leading CEOs Chad (Data Contracts), Tristan (Data Transformations), Barr (Data Observability), and Prukalpa (Data Catalogs) who are pioneering new approaches to operationalizing data by “Shifting Left.” This engaging panel will explore how embedding rigorous data management practices early in the data lifecycle reduces issues downstream, enhances data reliability, and empowers software engineers with clear visibility into data expectations. Attendees will gain insights into how data contracts define accountability, how effective transformations ensure data usability at scale, how proactive how proactive data and AI observability drives continuous confidence in data quality, and how catalogs enable data discoverability, accelerating innovation and trust across organizations.

Generative AI's transformative power underscores the critical need for high-quality data. In this session, Barr Moses, CEO of Monte Carlo Data, Prukalpa Sankar, Cofounder at Atlan, and George Fraser, CEO at Fivetran, discuss the nuances of scaling data quality for generative AI applications, highlighting the unique challenges and considerations that come into play. Throughout the session, they share best practices for data and AI leaders to navigate these challenges, ensuring that governance remains a focal point even amid the AI hype cycle. Links Mentioned in the Show: Rewatch Session from RADAR: AI Edition New to DataCamp? Learn on the go using the DataCamp mobile app Empower your business with world-class data and AI skills with DataCamp for business

Summary Data observability is a product category that has seen massive growth and adoption in recent years. Monte Carlo is in the vanguard of companies who have been enabling data teams to observe and understand their complex data systems. In this episode founders Barr Moses and Lior Gavish rejoin the show to reflect on the evolution and adoption of data observability technologies and the capabilities that are being introduced as the broader ecosystem adopts the practices.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their new managed database service you can launch a production ready MySQL, Postgres, or MongoDB cluster in minutes, with automated backups, 40 Gbps connections from your application hosts, and high throughput SSDs. Go to dataengineeringpodcast.com/linode today and get a $100 credit to launch a database, create a Kubernetes cluster, or take advantage of all of their other services. And don’t forget to thank them for their continued support of this show! Atlan is the metadata hub for your data ecosystem. Instead of locking your metadata into a new silo, unleash its transformative potential with Atlan’s active metadata capabilities. Push information about data freshness and quality to your business intelligence, automatically scale up and down your warehouse based on usage patterns, and let the bots answer those questions in Slack so that the humans can focus on delivering real value. Go to dataengineeringpodcast.com/atlan today to learn more about how Atlan’s active metadata platform is helping pioneering data teams like Postman, Plaid, WeWork & Unilever achieve extraordinary things with metadata and escape the chaos. RudderStack helps you build a customer data platform on your warehouse or data lake. Instead of trapping data in a black box, they enable you to easily collect customer data from the entire stack and build an identity graph on your warehouse, giving you full visibility and control. Their SDKs make event streaming from any app or website easy, and their state-of-the-art reverse ETL pipelines enable you to send enriched data to any cloud tool. Sign up free… or just get the free t-shirt for being a listener of the Data Engineering Podcast at dataengineeringpodcast.com/rudder. The only thing worse than having bad data is not knowing that you have it. With Bigeye’s data observability platform, if there is an issue with your data or data pipelines you’ll know right away and can get it fixed before the business is impacted. Bigeye let’s data teams measure, improve, and communicate the quality of your data to company stakeholders. With complete API access, a user-friendly interface, and automated yet flexible alerting, you’ve got everything you need to establish and maintain trust in your data. Go to dataengineeringpodcast.com/bigeye today to sign up and start trusting your analyses. Your host is Tobias Macey and today I’m interviewing Barr Moses and Lior Gavish about the state of the market for data observability and their own work at Monte Carlo

Interview

Introduction How did you get involved in the area of data management? Can you give the elevator pitch for Monte Carlo?

What are the notable changes in the Monte Carlo product and business since our last conversation in October 2020?

You were one of the early entrants in the market of data quality/data observability products. In your work to gain visibility and traction you invested substantially in content creation (blog posts, presentations, round table conversations, etc.). How would you summarize the focus of your initial efforts? Why do you think data observability has really taken off? A few years ago, the category barely existed – what’s changed? There’s a larger debate within

Broken data is costly, time-consuming, and nowadays, an all-too-common reality for even the most advanced data teams. In this talk, I’ll introduce this problem, called “data downtime” — periods of time when data is partial, erroneous, missing or otherwise inaccurate — and discuss how to eliminate it in your data ecosystem with end-to-end data observability. Drawing corollaries to application observability in software engineering, data observability is a critical component of the modern DataOps workflow and the key to ensuring data trust at scale. I’ll share why data observability matters when it comes to building a better data quality strategy and highlight tactics you can use to address it today.

In this episode of DataFramed, Adel speaks with Barr Moses, CEO, and co-founder of Monte Carlo on the importance of data quality and how data observability creates trust in data throughout the organization. 

Throughout the episode, Barr talks about her background, the state of data-driven organizations and what it means to be data-driven, the data maturity of organizations, the importance of data quality, what data observability is, and why we’ll hear about it more often in the future. She also covers the state of data infrastructure, data meshes, and more. 

Relevant links from the interview:

Connect with Barr on LinkedInLearn more about data meshesCheck out the Monte Carlo blogDataCamp's Guide to Organizational Data Maturity

We covered:

Barr’s background Market gaps in data reliability Observability in engineering Data downtime Data quality problems and the five pillars of data observability Example: job failing because of a schema change Three pillars of observability (good pipelines and bad data) Observability vs monitoring Finding the root cause Who is accountable for data quality? (the RACI framework) Service level agreements Inferring the SLAs from the historical data Implementing data observability Data downtime maturity curve Monte carlo: data observability solution Open source tools Test-driven development for data Is data observability cloud agnostic? Centralizing data observability Detecting downstream and upstream data usage Getting bad data vs getting unusual data

Links:

Learn more about Monte Carlo: https://www.montecarlodata.com/ The Data Engineer's Guide to Root Cause Analysis: https://www.montecarlodata.com/the-data-engineers-guide-to-root-cause-analysis/ Why You Need to Set SLAs for Your Data Pipelines: https://www.montecarlodata.com/how-to-make-your-data-pipelines-more-reliable-with-slas/ Data Observability: The Next Frontier of Data Engineering: https://www.montecarlodata.com/data-observability-the-next-frontier-of-data-engineering/ To get in touch with Barr, ping her in the DataTalks.Club group or use [email protected]

Join DataTalks.Club: https://datatalks.club/slack.html

Summary In order for analytics and machine learning projects to be useful, they require a high degree of data quality. To ensure that your pipelines are healthy you need a way to make them observable. In this episode Barr Moses and Lior Gavish, co-founders of Monte Carlo, share the leading causes of what they refer to as data downtime and how it manifests. They also discuss methods for gaining visibility into the flow of data through your infrastructure, how to diagnose and prevent potential problems, and what they are building at Monte Carlo to help you maintain your data’s uptime.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management What are the pieces of advice that you wish you had received early in your career of data engineering? If you hand a book to a new data engineer, what wisdom would you add to it? I’m working with O’Reilly on a project to collect the 97 things that every data engineer should know, and I need your help. Go to dataengineeringpodcast.com/97things to add your voice and share your hard-earned expertise. When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Are you bogged down by having to manually manage data access controls, repeatedly move and copy data, and create audit reports to prove compliance? How much time could you save if those tasks were automated across your cloud platforms? Immuta is an automated data governance solution that enables safe and easy data analytics in the cloud. Our comprehensive data-level security, auditing and de-identification features eliminate the need for time-consuming manual processes and our focus on data and compliance team collaboration empowers you to deliver quick and valuable data analytics on the most sensitive data to unlock the full potential of your cloud data platforms. Learn how we streamline and accelerate manual processes to help you derive real results from your data at dataengineeringpodcast.com/immuta. Today’s episode of the Data Engineering Podcast is sponsored by Datadog, a SaaS-based monitoring and analytics platform for cloud-scale infrastructure, applications, logs, and more. Datadog uses machine-learning based algorithms to detect errors and anomalies across your entire stack—which reduces the time it takes to detect and address outages and helps promote collaboration between Data Engineering, Operations, and the rest of the company. Go to dataengineeringpodcast.com/datadog today to start your free 14 day trial. If you start a trial and install Datadog’s agent, Datadog will send you a free T-shirt. You listen to this show to learn and stay up to date with what’s happening in databases, streaming platforms, big data, and everything else you need to know about modern data platforms. For more opportunities to stay up to date, gain new skills, and learn from your peers there are a growing number of virtual events that you can attend from the comfort and safety of your home. Go to dataengineeringpodcast.com/conferences to check out the upcoming events being offered by our partners and get registered today! Your host is Tobias Macey and today I’m interviewing Barr Moses and Lior Gavish about observability for your data pipelines and how they are addressing it at Monte Carlo.

Interview

Introduction How did you get involved in the area of data management? H