talk-data.com talk-data.com

Prukalpa Sankar

Speaker

Prukalpa Sankar

13

talks

Co-founder Atlan

Prukalpa Sankar is the co-founder of Atlan, a modern collaborative data workspace. She has previously co-founded a leading data for good company where she led data teams for organizations like the UN, the World Bank, and several large governments.

Bio from: dbt Coalesce 2022

Frequent Collaborators

Filter by Event / Source

Talks & appearances

13 activities · Newest first

Search activities →
Sponsored by: Atlan | How Fox & Atlan are Partnering to Make Metadata a Common System of Trust, Context, and Governance

With hundreds of millions viewing broadcasts from news to sports, Fox relies on a sophisticated and trusted architecture ingesting 100+ data sources, carefully governed to improve UX across products, drive sales and marketing, and ensure KPI tracking. Join Oliver Gomes, VP of Enterprise and Data Platform at Fox, and Prukalpa Sankar of Atlan to learn how true partnership helps their team navigate opportunities from Governance to AI. To govern and democratize their multi-cloud data platform, Fox chose Atlan to make data accessible and understandable for more users than ever before. Their team then used a data product approach to create a shared language using context from sources like Unity Catalog at a single point of access, no matter the underlying technology. Now, Fox is defining an ambitious future for Metadata. With Atlan and Iceberg driving interoperability, their team prepares to build a “control plane”, creating a common system of trust and governance.

With over 45 million connections across broadband, mobile, TV, and home phone, VMO2 is on a mission to upgrade the UK. Hard at work making their data trustworthy and accessible, their data team is now ensuring nearly 16,000 colleagues can make the most of it.
Join Mauro Flores, EVP, Data Democratisation at VMO2, and Prukalpa Sankar, Co-founder of Atlan, for a behind-the-scenes look at how they’ve transformed Data Governance into a business enabler, taking an AI- and domain-based approach to build a single definition of trustworthy data, and unleashing it for end-users through data products.

Panel: Shift Left Across the Data Lifecycle—Data Contracts, Transformations, Observability, and C...

Panel: Shift Left Across the Data Lifecycle—Data Contracts, Transformations, Observability, and Catalogs | Prukalpa Sankar, Tristan Handy, Barr Moses, Chad Sanderson | Shift Left Data Conference 2025

Join industry-leading CEOs Chad (Data Contracts), Tristan (Data Transformations), Barr (Data Observability), and Prukalpa (Data Catalogs) who are pioneering new approaches to operationalizing data by “Shifting Left.” This engaging panel will explore how embedding rigorous data management practices early in the data lifecycle reduces issues downstream, enhances data reliability, and empowers software engineers with clear visibility into data expectations. Attendees will gain insights into how data contracts define accountability, how effective transformations ensure data usability at scale, how proactive how proactive data and AI observability drives continuous confidence in data quality, and how catalogs enable data discoverability, accelerating innovation and trust across organizations.

Generative AI's transformative power underscores the critical need for high-quality data. In this session, Barr Moses, CEO of Monte Carlo Data, Prukalpa Sankar, Cofounder at Atlan, and George Fraser, CEO at Fivetran, discuss the nuances of scaling data quality for generative AI applications, highlighting the unique challenges and considerations that come into play. Throughout the session, they share best practices for data and AI leaders to navigate these challenges, ensuring that governance remains a focal point even amid the AI hype cycle. Links Mentioned in the Show: Rewatch Session from RADAR: AI Edition New to DataCamp? Learn on the go using the DataCamp mobile app Empower your business with world-class data and AI skills with DataCamp for business

In the fast-paced work environments we are used to, the ability to quickly find and understand data is essential. Data professionals can often spend more time searching for data than analyzing it, which can hinder business progress. Innovations like data catalogs and automated lineage systems are transforming data management, making it easier to ensure data quality, trust, and compliance. By creating a strong metadata foundation and integrating these tools into existing workflows, organizations can enhance decision-making and operational efficiency. But how did this all come to be, who is driving better access and collaboration through data? Prukalpa Sankar is the Co-founder of Atlan. Atlan is a modern data collaboration workspace (like GitHub for engineering or Figma for design). By acting as a virtual hub for data assets ranging from tables and dashboards to models & code, Atlan enables teams to create a single source of truth for all their data assets, and collaborate across the modern data stack through deep integrations with tools like Slack, BI tools, data science tools and more. A pioneer in the space, Atlan was recognized by Gartner as a Cool Vendor in DataOps, as one of the top 3 companies globally. Prukalpa previously co-founded SocialCops, world leading data for good company (New York Times Global Visionary, World Economic Forum Tech Pioneer). SocialCops is behind landmark data projects including India’s National Data Platform and SDGs global monitoring in collaboration with the United Nations. She was awarded Economic Times Emerging Entrepreneur for the Year, Forbes 30u30, Fortune 40u40, Top 10 CNBC Young Business Women 2016, and a TED Speaker. In the episode, Richie and Prukalpa explore challenges within data discoverability, the inception of Atlan, the importance of a data catalog, personalization in data catalogs, data lineage, building data lineage, implementing data governance, human collaboration in data governance, skills for effective data governance, product design for diverse audiences, regulatory compliance, the future of data management and much more.  Links Mentioned in the Show: AtlanConnect with Prukalpa[Course] Artificial Intelligence (AI) StrategyRelated Episode: Adding AI to the Data Warehouse with Sridhar Ramaswamy, CEO at SnowflakeSign up to RADAR: AI Edition New to DataCamp? Learn on the go using the DataCamp mobile app Empower your business with world-class data and AI skills with DataCamp for business

WARNING: This episode contains detailed discussion of data contracts. The modern data stack introduces challenges in terms of collaboration between data producers and consumers. How might we solve them to ultimately build trust in data quality? Chad Sanderson leads the data platform team at Convoy, a late-stage series-E freight technology startup. He manages everything from instrumentation and data ingestion to ETL, in addition to the metrics layer, experimentation software and ML.  Prukalpa Sankar is a co-founder of Atlan, where she develops products that enable improved collaboration between diverse users like businesses, analysts, and engineers, creating higher efficiency and agility in data projects.  For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.  The Analytics Engineering Podcast is sponsored by dbt Labs.

Beyond the buzz: 20 real metadata use cases in 20 minutes with Atlan and dbt Labs

for a few use cases like static and passive data catalogs. However, active metadata can be the key to unlock a variety of use cases, acting as the glue that binds together our diverse modern data stacks (e.g. dbt, Snowflake, Fivetran, Databricks, Looker, and Tableau) and diverse teams (e.g. analytics engineers, data analysts, data engineers, and business users)! At Atlan, we’ve worked closely with modern data teams like WeWork, Plaid, PayU, SnapCommerce, and Bestow. In this session, we’ll lay out all our learnings about how real-life data teams are using metadata to drive powerful use cases like column-level lineage, programmatic governance, root cause analysis, proactive upstream alerts, dynamic pipeline optimization, cost optimization, data deprecation, automated quality control, metrics management, and more. P.S. We’ll also reveal how active metadata and the dbt Semantic Layer can work together to transform the way your team works with metrics!

Check the slides here: https://docs.google.com/presentation/d/1xrC9yhHOQ00qWt-gVlgbakRELg2FzEPt-RwMsUWzdZA/edit?usp=sharing

Coalesce 2023 is coming! Register for free at https://coalesce.getdbt.com/.

More Context, Less Chaos: How Atlan and Unity Catalog Power Column-Level Lineage and Active Metadata

“What does this mean? Who created it? How is it being used? Is it up to date?” Ever fielded these types of questions about your Databricks assets?

Today, context is a huge challenge for data teams. Everyone wants to use your company’s data, but often only a few experts know all of its tribal knowledge and context. The result — they get bombarded with endless questions and requests.

Atlan — the active metadata platform for modern data teams, recently named a Leader in The Forrester Wave: Enterprise Data Catalogs for DataOps — has launched an integration with Databricks Unity Catalog. By connecting to UC’s REST API, Atlan extracts metadata from Databricks clusters and workspaces, generates column-level lineage, and pairs it with metadata from the rest of your data assets to create true end-to-end lineage and visibility across your data stack.

In this session, Prukalpa Sankar (Co-Founder at Atlan and a lifelong data practitioner) and Todd Greenstein (Product Manager with Databricks) will do a live product demo to show how Atlan and Databricks work together to power modern data governance, cataloging, and collaboration.

Connect with us: Website: https://databricks.com Facebook: https://www.facebook.com/databricksinc Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/data... Instagram: https://www.instagram.com/databricksinc/

Summary Metadata is the lifeblood of your data platform, providing information about what is happening in your systems. A variety of platforms have been developed to capture and analyze that information to great effect, but they are inherently limited in their utility due to their nature as storage systems. In order to level up their value a new trend of active metadata is being implemented, allowing use cases like keeping BI reports up to date, auto-scaling your warehouses, and automated data governance. In this episode Prukalpa Sankar joins the show to talk about the work she and her team at Atlan are doing to push this capability into the mainstream.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their new managed database service you can launch a production ready MySQL, Postgres, or MongoDB cluster in minutes, with automated backups, 40 Gbps connections from your application hosts, and high throughput SSDs. Go to dataengineeringpodcast.com/linode today and get a $100 credit to launch a database, create a Kubernetes cluster, or take advantage of all of their other services. And don’t forget to thank them for their continued support of this show! RudderStack helps you build a customer data platform on your warehouse or data lake. Instead of trapping data in a black box, they enable you to easily collect customer data from the entire stack and build an identity graph on your warehouse, giving you full visibility and control. Their SDKs make event streaming from any app or website easy, and their state-of-the-art reverse ETL pipelines enable you to send enriched data to any cloud tool. Sign up free… or just get the free t-shirt for being a listener of the Data Engineering Podcast at dataengineeringpodcast.com/rudder. Data teams are increasingly under pressure to deliver. According to a recent survey by Ascend.io, 95% in fact reported being at or over capacity. With 72% of data experts reporting demands on their team going up faster than they can hire, it’s no surprise they are increasingly turning to automation. In fact, while only 3.5% report having current investments in automation, 85% of data teams plan on investing in automation in the next 12 months. 85%!!! That’s where our friends at Ascend.io come in. The Ascend Data Automation Cloud provides a unified platform for data ingestion, transformation, orchestration, and observability. Ascend users love its declarative pipelines, powerful SDK, elegant UI, and extensible plug-in architecture, as well as its support for Python, SQL, Scala, and Java. Ascend automates workloads on Snowflake, Databricks, BigQuery, and open source Spark, and can be deployed in AWS, Azure, or GCP. Go to dataengineeringpodcast.com/ascend and sign up for a free trial. If you’re a data engineering podcast listener, you get credits worth $5,000 when you become a customer. Today’s episode is Sponsored by Prophecy.io – the low-code data engineering platform for the cloud. Prophecy provides an easy-to-use visual interface to design & deploy data pipelines on Apache Spark & Apache Airflow. Now all the data users can use software engineering best practices – git, tests and continuous deployment with a simple to use visual designer. How does it work? – You visually design the pipelines, and Prophecy generates clean Spark code with tests on git; then you visually schedule these pipelines on Airflow. You can observe your pipelines with built in metadata search and column level lineage. Finally, if you have existing workflows in AbInitio, Informatica or other ETL formats that you want to move to the cloud, you can import them automatically into Prophecy making them run productively on Spark. Create your free account today at dataengineeringpodcast.com/prophecy. Your host is Tobias Macey and today I’m interviewing Prukalpa Sankar about how data platforms can benefit from the idea of "active metadata" and the work that she and her team at Atlan are doing to make it a reality

Interview

Introduction How did you get involved in the area of data management? Can you describe what "active metadata" is and how it differs from the current approaches to metadata systems? What are some of the use cases that "active metadata" can enable for data producers and consumers?

What are the points of friction that those users encounter in the current formulation of metadata systems?

Central metadata systems/data catalogs came about as a solution to the challenge of integrating every data tool with every other data tool, giving a single place to integrate. What are the lessons that are being learned from the "modern data stack" that can be applied to centralized metadata? Can you describe the approach that you are taking at Atlan to enable the adoption of "active metadata"?

What are the architectural capabilities that you had to build to power the outbound traffic flows?

How are you addressing the N x M integration problem for pushing metadata into the necessary contexts at Atlan?

What are the interfaces that are necessary for receiving systems to be able to make use of the metadata that is being delivered? How does the type/category of metadata impact the type of integration that is necessary?

What are some of the automation possibilities that metadata activation offers for data teams?

What are the cases where you still need a human in the loop?

What are the most interesting, innovative, or unexpected ways that you have seen active metadata capabilities used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on activating metadata for your users? When is an active approach to metadata the wrong choice? What do you have planned for the future of Atlan and active metadata?

Contact Info

LinkedIn @prukalpa on Twitter

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don’t forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected]) with your story. To help other people find the show please leave a review on Apple Podcasts and tell your friends and co-workers

Links

Atlan What is Active Metadata? Segment

Podcast Episode

Zapier ArgoCD Kubernetes Wix AWS Lambda Modern Data Culture Blog Post

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Support Data Engineering Podcast

podcast_episode
with Val Kroll , Julie Hoyer , Tim Wilson (Analytics Power Hour - Columbus (OH) , Moe Kiss (Canva) , Michael Helbling (Search Discovery) , Prukalpa Sankar (Atlan)

It's easy to get sucked into the "technology" side of things when it comes to improving the effectiveness and scaling up data teams, but, much to Tim's dismay, shoring up the people, process, and culture is often just as (if not more) critical. So, sure, we can talk about the modern data stack, but what about the modern data CULTURE stack? Prukalpa Sankar, co-founder of Atlan, joined us for a lively discussion of the topic! For complete show notes, including links to items mentioned in this episode and a transcript of the show, visit the show page.

The number of women entering data professions is growing, and men need to adapt. This podcast is designed to enlighten men about the role of women in the data field. Our guests are all executives at data and analytics software companies who have held positions in other sectors of our field: Prukalpa Sankar, Cindi Howson, Debika Sharma.

Summary One of the biggest obstacles to success in delivering data products is cross-team collaboration. Part of the problem is the difference in the information that each role requires to do their job and where they expect to find it. This introduces a barrier to communication that is difficult to overcome, particularly in teams that have not reached a significant level of maturity in their data journey. In this episode Prukalpa Sankar shares her experiences across multiple attempts at building a system that brings everyone onto the same page, ultimately bringing her to found Atlan. She explains how the design of the platform is informed by the needs of managing data projects for large and small teams across her previous roles, how it integrates with your existing systems, and how it can work to bring everyone onto the same page.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar and Pachyderm. With simple pricing, fast networking, object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show! Modern Data teams are dealing with a lot of complexity in their data pipelines and analytical code. Monitoring data quality, tracing incidents, and testing changes can be daunting and often takes hours to days. Datafold helps Data teams gain visibility and confidence in the quality of their analytical data through data profiling, column-level lineage and intelligent anomaly detection. Datafold also helps automate regression testing of ETL code with its Data Diff feature that instantly shows how a change in ETL or BI code affects the produced data, both on a statistical level and down to individual rows and values. Datafold integrates with all major data warehouses as well as frameworks such as Airflow & dbt and seamlessly plugs into CI workflows. Go to dataengineeringpodcast.com/datafold today to start a 30-day trial of Datafold. Once you sign up and create an alert in Datafold for your company data, they will send you a cool water flask. RudderStack’s smart customer data pipeline is warehouse-first. It builds your customer data warehouse and your identity graph on your data warehouse, with support for Snowflake, Google BigQuery, Amazon Redshift, and more. Their SDKs and plugins make event streaming easy, and their integrations with cloud applications like Salesforce and ZenDesk help you go beyond event streaming. With RudderStack you can use all of your customer data to answer more difficult questions and then send those insights to your whole customer data stack. Sign up free at dataengineeringpodcast.com/rudder today. Your host is Tobias Macey and today I’m interviewing Prukalpa Sankar about Atlan, a modern data workspace that makes collaboration among data stakeholders easier, increasing efficiency and agility in data projects

Interview

Introduction How did you get involved in the area of data management? Can you start by giving an overview of what you are building at Atlan and some of the story behind it? Who are the target users of Atlan? What portions of the data workflow is Atlan responsible for?

What components of the data stack might Atlan replace?

How would you characterize Atlan’s position in the current data ecosystem?

What makes Atlan stand out from other systems for data cataloguing, metadata management, or data governance? What types of data assets (e.g. structured vs unstructured, textual