talk-data.com

Topic

Data Governance

Tags: data_management, compliance, data_quality


Activity Trend: peak of 90 activities per quarter, 2020-Q1 to 2026-Q1

Activities

417 activities · Newest first

Logical representations of data, rather than a deep technical understanding of a database or dataset, are key to the democratisation of, and ease of access to, data going forward. In this session, Steve will give his thoughts on what is driving this need and how organisations can approach it in a pragmatic way, focusing on the links between good data governance, good data tech, good data analytics and ultimately good data-led decision making.

Data governance is more important than ever today, but it's something a lot of companies still struggle with. During this live show, George Firican will demystify data governance, breaking down its core components and sharing practical advice that you can use to make improvements at your organization.

What You'll Learn:
- Why data governance is more important than ever in 2024
- The core pillars of a strong data governance program
- Practical tips for launching a new data governance program

Register for free to be part of the next live session: https://bit.ly/3XB3A8b

About our guest: George Firican is the Founder of LightsOnData, and a Data Governance expert and course creator.
- The Practical Data Governance: Implementation Course
- Subscribe to George's YouTube Channel
- Follow George on LinkedIn


There’s been a lot of pressure to add AI to almost every digital tool and service recently, and two years into the AI hype cycle, we’re seeing two types of problems. The first is organizations that haven’t done much yet with AI because they don’t know where to start. The second is organizations that rushed into AI and failed because they didn’t know what they were doing. Both are symptoms of the same problem: not having an AI strategy and not understanding how to tactically implement AI. There’s a lot to consider around choosing the right project and putting processes and skilled talent in place, not to mention worrying about costs and return on investment. Tathagat Varma is the Global TechOps Leader at Walmart Global Tech. Tathagat is responsible for leading strategic business initiatives, enterprise agile transformation, technical learning and enablement, strategic technical initiatives, startup ecosystem engagement, and internal events across Walmart Global Tech. He also provides support to horizontal technical and internal innovation programs in the company. Starting as a Computer Scientist with DRDO, and with 27 years of overall experience, Tathagat has played significant technical and leadership roles in establishing and growing organizations like NerdWallet, ChinaSoft International, McAfee, Huawei, Network General, NetScout System, [24]7 Innovations Labs and Yahoo!, and played key engineering roles at Siemens and Philips. In the episode, Richie and Tathagat explore failures in AI adoption, the role of leadership in AI adoption, AI strategy and business objective alignment, investment and timeline for AI projects, identifying starter AI projects, skills for AI success, building a culture of AI adoption, the potential of AI and much more.

Links Mentioned in the Show:
- Walmart Global Tech
- Connect with Tathagat
- [Course] Data Governance Concepts
- Related Episode: How Walmart Leverages Data & AI with Swati Kirti, Sr Director of Data Science at Walmart
- Rewatch sessions from RADAR: AI Edition

New to DataCamp? Learn on the go using the DataCamp mobile app. Empower your business with world-class data and AI skills with DataCamp for business.

One of the most annoying conversations about data that happens far too often is: “Can you do an analysis and answer this business problem for me?” “Sure, where’s the data?” “I don’t know. Probably in one of our databases.” At this point more time is spent hunting for data than actually analyzing it. Rather than grumbling about it, it would obviously be more productive to learn how to solve data discoverability issues. What’s the best way to properly document data sets? How can you avoid spending all your time maintaining dashboards that no one actually uses? Shinji Kim is the Founder & CEO of Select Star, an automated data discovery platform that helps you understand your data. Previously, she was the CEO of Concord Systems (concord.io), a NYC-based data infrastructure startup acquired by Akamai Technologies in 2016. She led the build-out of Akamai’s new IoT data platform for real-time messaging, log processing, and edge computing. Prior to Concord, Shinji was the first Product Manager hired at Yieldmo, where she led the Ad Format Lab, A/B testing, and yield optimization. Before Yieldmo, she was analyzing data and building enterprise applications at Deloitte Consulting, Facebook, Sun Microsystems, and Barclays Capital. Shinji studied Software Engineering at the University of Waterloo and General Management at Stanford GSB. She advises early-stage startups on product strategy, customer development, and company building. In the episode, Richie and Shinji explore the importance of data governance, the utilization of data, data quality, challenges in data usage, why documentation matters, metadata and data lineage, improving collaboration between data and business teams, data governance trends to look forward to, and much more.

Links Mentioned in the Show:
- Select Star
- Connect with Shinji
- [Course] Data Governance Concepts
- Related Episode: Making Data Governance Fun with Tiankai Feng, Data Strategy & Data Governance Lead at ThoughtWorks
- Rewatch sessions from RADAR: AI Edition

New to DataCamp? Learn on the go using the DataCamp mobile app. Empower your business with world-class data and AI skills with DataCamp for business.

Practical Lakehouse Architecture

This concise yet comprehensive guide explains how to adopt a data lakehouse architecture to implement modern data platforms. It reviews the design considerations, challenges, and best practices for implementing a lakehouse and provides key insights into the ways that using a lakehouse can impact your data platform, from managing structured and unstructured data and supporting BI and AI/ML use cases to enabling more rigorous data governance and security measures.

Practical Lakehouse Architecture shows you how to:
- Understand key lakehouse concepts and features like transaction support, time travel, and schema evolution
- Understand the differences between traditional and lakehouse data architectures
- Differentiate between various file formats and table formats
- Design lakehouse architecture layers for storage, compute, metadata management, and data consumption
- Implement data governance and data security within the platform
- Evaluate technologies and decide on the best technology stack to implement the lakehouse for your use case
- Make critical design decisions and address practical challenges to build a future-ready data platform
- Start your lakehouse implementation journey and migrate data from existing systems to the lakehouse
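Two of those features, time travel and schema evolution, are easy to see in a few lines of code. The sketch below is not taken from the book; it is a minimal PySpark example using open source Delta Lake (the delta-spark package), and the table path and column names are made up for illustration.

```python
from delta import configure_spark_with_delta_pip
from pyspark.sql import SparkSession

# Minimal local setup for open source Delta Lake, following the standard
# quickstart; assumes the delta-spark package is installed.
builder = (
    SparkSession.builder.appName("lakehouse-sketch")
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
)
spark = configure_spark_with_delta_pip(builder).getOrCreate()

path = "/tmp/lakehouse/customers"  # hypothetical table location

# Version 0 of the table: two columns.
spark.createDataFrame([(1, "Alice")], ["id", "name"]) \
    .write.format("delta").mode("overwrite").save(path)

# Schema evolution: append rows with a new column and let Delta merge schemas.
spark.createDataFrame([(2, "Bob", "DE")], ["id", "name", "country"]) \
    .write.format("delta").mode("append") \
    .option("mergeSchema", "true").save(path)

# Time travel: query the table as it looked before the schema change.
spark.read.format("delta").option("versionAsOf", 0).load(path).show()
```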

Databricks Customers at Data + AI Summit

At this year's event, over 250 customers shared their data and AI journeys. They showcased a wide variety of use cases, best practices, and lessons from their leadership and innovation with the latest data and AI technologies.

See how enterprises are leveraging generative AI in their data operations and how innovative data management and data governance are fueling organizations as they race to develop GenAI applications. https://www.databricks.com/blog/how-real-world-enterprises-are-leveraging-generative-ai

To see more real-world use cases and customer success stories, visit: https://www.databricks.com/customers

Summary

This episode features an insightful conversation with Petr Janda, the CEO and founder of Synq. Petr shares his journey from being an engineer to founding Synq, emphasizing the importance of treating data systems with the same rigor as engineering systems. He discusses the challenges and solutions in data reliability, including the need for transparency and ownership in data systems. Synq's platform helps data teams manage incidents, understand data dependencies, and ensure data quality by providing insights and automation capabilities. Petr emphasizes the need for a holistic approach to data reliability, integrating data systems into broader business processes. He highlights the role of data teams in modern organizations and how Synq is empowering them to achieve this.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management. Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst is an end-to-end data lakehouse platform built on Trino, the query engine Apache Iceberg was designed for, with complete support for all table formats including Apache Iceberg, Hive, and Delta Lake. Trusted by teams of all sizes, including Comcast and Doordash. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino. Your host is Tobias Macey and today I'm interviewing Petr Janda about Synq, a data reliability platform focused on leveling up data teams by supporting a culture of engineering rigor.

Interview

Introduction How did you get involved in the area of data management? Can you describe what Synq is and the story behind it?

Data observability/reliability is a category that grew rapidly over the past ~5 years and has several vendors focused on different elements of the problem. What are the capabilities that you saw as lacking in the ecosystem which you are looking to address?

Operational/infrastructure engineers have spent the past decade honing their approach to incident management and uptime commitments. How do those concepts map to the responsibilities and workflows of data teams? Tooling only plays a small part in SLAs and incident management. How does Synq help to support the cultural transformation that is necessary?

What does an on-call rotation for a data engineer/data platform engineer look like as compared with an application-focused team?

How does the focus on data assets/data products shift your approach to observability as compared to a table/pipeline centric approach?

With the focus on sharing ownership beyond the boundaries of the data team there is a strong correlation with data governance principles. How do you see organizations incorporating Synq into their approach to data governance/compliance?

Can you describe how Synq is designed/implemented? How have the scope and goals of the product changed since you first started working on it?

For a team who is onboarding onto Synq, what are the steps required to get it integrated into their technology stack and workflows?

What are the types of incidents/errors that you are able to identify and alert on? What does a typical incident/error resolution process look like with Synq?

What are the most interesting, innovative, or unexpected ways that you have seen Synq used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Synq? When is Synq the wrong choice? What do you have planned for the future of Synq?

Contact Info

LinkedIn Substack

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

Links

Synq, Incident Management, SLA == Service Level Agreement, Data Governance (Podcast Episode), PagerDuty, OpsGenie, Clickhouse (Podcast Episode), dbt (Podcast Episode), SQLMesh (Podcast Episode)

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Data + AI Summit Keynote Day 1 - Full (video)
by Patrick Wendell (Databricks), Fei-Fei Li (Stanford University), Brian Ames (General Motors), Ken Wong (Databricks), Ali Ghodsi (Databricks), Jackie Brosamer (Block), Reynold Xin (Databricks), Jensen Huang (NVIDIA)

Databricks Data + AI Summit 2024 Keynote Day 1

Experts, researchers, and open source contributors from Databricks and across the data and AI community gathered in San Francisco June 10-13, 2024, to discuss the latest technologies in data management, data warehousing, data governance, generative AI for the enterprise, and data in the era of AI.

Hear from Databricks Co-founder and CEO Ali Ghodsi on building generative AI applications, putting your data to work, and how data + AI leads to data intelligence.

Plus a fireside chat between Ali Ghodsi and NVIDIA Co-founder and CEO, Jensen Huang, on the expanded partnership between NVIDIA and Databricks to accelerate enterprise data for the era of generative AI.

Product announcements in the video include:
- Databricks Data Intelligence Platform
- Native support for NVIDIA GPU acceleration on the Databricks Data Intelligence Platform
- Databricks open source model DBRX available as an NVIDIA NIM microservice
- Shutterstock Image AI powered by Databricks
- Databricks AI/BI
- Databricks LakeFlow
- Databricks Mosaic AI
- Mosaic AI Agent Framework
- Mosaic AI Agent Evaluation
- Mosaic AI Tools Catalog
- Mosaic AI Model Training
- Mosaic AI Gateway

In this keynote hear from:
- Ali Ghodsi, Co-founder and CEO, Databricks (1:45)
- Brian Ames, General Motors (29:55)
- Patrick Wendell, Co-founder and VP of Engineering, Databricks (38:00)
- Jackie Brosamer, Head of AI, Data and Analytics, Block (1:14:42)
- Fei-Fei Li, Professor, Stanford University and Denning Co-Director, Stanford Institute for Human-Centered AI (1:23:15)
- Jensen Huang, Co-founder and CEO of NVIDIA, with Ali Ghodsi, Co-founder and CEO of Databricks (1:42:27)
- Reynold Xin, Co-founder and Chief Architect, Databricks (2:07:43)
- Ken Wong, Senior Director, Product Management, Databricks (2:31:15)
- Ali Ghodsi, Co-founder and CEO, Databricks (2:48:16)

In the fast-paced work environments we are used to, the ability to quickly find and understand data is essential. Data professionals can often spend more time searching for data than analyzing it, which can hinder business progress. Innovations like data catalogs and automated lineage systems are transforming data management, making it easier to ensure data quality, trust, and compliance. By creating a strong metadata foundation and integrating these tools into existing workflows, organizations can enhance decision-making and operational efficiency. But how did this all come to be, and who is driving better access and collaboration through data? Prukalpa Sankar is the Co-founder of Atlan. Atlan is a modern data collaboration workspace (like GitHub for engineering or Figma for design). By acting as a virtual hub for data assets ranging from tables and dashboards to models and code, Atlan enables teams to create a single source of truth for all their data assets and collaborate across the modern data stack through deep integrations with tools like Slack, BI tools, data science tools, and more. A pioneer in the space, Atlan was recognized by Gartner as a Cool Vendor in DataOps, one of the top 3 companies globally. Prukalpa previously co-founded SocialCops, a world-leading data-for-good company (New York Times Global Visionary, World Economic Forum Tech Pioneer). SocialCops is behind landmark data projects including India’s National Data Platform and SDGs global monitoring in collaboration with the United Nations. She was awarded Economic Times Emerging Entrepreneur of the Year, Forbes 30u30, Fortune 40u40, and Top 10 CNBC Young Business Women 2016, and is a TED Speaker. In the episode, Richie and Prukalpa explore challenges within data discoverability, the inception of Atlan, the importance of a data catalog, personalization in data catalogs, data lineage, building data lineage, implementing data governance, human collaboration in data governance, skills for effective data governance, product design for diverse audiences, regulatory compliance, the future of data management and much more.

Links Mentioned in the Show:
- Atlan
- Connect with Prukalpa
- [Course] Artificial Intelligence (AI) Strategy
- Related Episode: Adding AI to the Data Warehouse with Sridhar Ramaswamy, CEO at Snowflake
- Sign up to RADAR: AI Edition

New to DataCamp? Learn on the go using the DataCamp mobile app. Empower your business with world-class data and AI skills with DataCamp for business.

Summary

Modern businesses aspire to be data driven, and technologists enjoy working through the challenge of building data systems to support that goal. Data governance is the binding force between these two parts of the organization. Nicola Askham found her way into data governance by accident, and stayed because of the benefit that she was able to provide by serving as a bridge between the technology and business. In this episode she shares the practical steps to implementing a data governance practice in your organization, and the pitfalls to avoid.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management. Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst is an end-to-end data lakehouse platform built on Trino, the query engine Apache Iceberg was designed for, with complete support for all table formats including Apache Iceberg, Hive, and Delta Lake. Trusted by teams of all sizes, including Comcast and Doordash. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino. This episode is supported by Code Comments, an original podcast from Red Hat. As someone who listens to the Data Engineering Podcast, you know that the road from tool selection to production readiness is anything but smooth or straight. In Code Comments, host Jamie Parker, Red Hatter and experienced engineer, shares the journey of technologists from across the industry and their hard-won lessons in implementing new technologies. I listened to the recent episode "Transforming Your Database" and appreciated the valuable advice on how to approach the selection and integration of new databases in applications and the impact on team dynamics. There are 3 seasons of great episodes and new ones landing everywhere you listen to podcasts. Search for "Code Comments" in your podcast player or go to dataengineeringpodcast.com/codecomments today to subscribe. My thanks to the team at Code Comments for their support. Your host is Tobias Macey and today I'm interviewing Nicola Askham about the practical steps of building out a data governance practice in your organization

Interview

Introduction How did you get involved in the area of data management? Can you start by giving an overview of the scope and boundaries of data governance in an organization?

At what point does a lack of an explicit governance policy become a liability?

What are some of the misconceptions that you encounter about data governance? What impact has the evolution of data technologies had on the implementation of governance practices? (e.g. number/scale of systems, types of data, AI) Data governance can often become an exercise in boiling the ocean. What are the concrete first steps that will increase the success rate of a governance practice?

Once a data governance project is underway, what are some of the common roadblocks that might derail progress?

What are the net benefits to the data team and the organization when a data governance practice is established, active, and healthy? What are the most interesting, innovative, or unexpected ways that you have seen data governance applied? What are the most interesting, unexpected, or challenging lessons that you have learned while working on data governance/training/coaching? What are some of the pitfalls in data governance? What are some of the future trends in data governance that you are excited by?

Are there any trends that concern you?

Contact Info

Website LinkedIn

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used.

Data Engineering with Databricks Cookbook

In "Data Engineering with Databricks Cookbook," you'll learn how to efficiently build and manage data pipelines using Apache Spark, Delta Lake, and Databricks. This recipe-based guide offers techniques to transform, optimize, and orchestrate your data workflows. What this Book will help me do Master Apache Spark for data ingestion, transformation, and analysis. Learn to optimize data processing and improve query performance with Delta Lake. Manage streaming data processing with Spark Structured Streaming capabilities. Implement DataOps and DevOps workflows tailored for Databricks. Enforce data governance policies using Unity Catalog for scalable solutions. Author(s) Pulkit Chadha, the author of this book, is a Senior Solutions Architect at Databricks. With extensive experience in data engineering and big data applications, he brings practical insights into implementing modern data solutions. His educational writings focus on empowering data professionals with actionable knowledge. Who is it for? This book is ideal for data engineers, data scientists, and analysts who want to deepen their knowledge in managing and transforming large datasets. Readers should have an intermediate understanding of SQL, Python programming, and basic data architecture concepts. It is especially well-suited for professionals working with Databricks or similar cloud-based data platforms.

Everything in the world has a price, including improving and scaling your data and AI functions. That means that at some point someone will question the ROI of your projects, and often, these projects will be looked at under the lens of monetization. But how do you ensure that what you’re working on is not only providing value to the business but also creating financial gain? What conditions need to be met to prove your project's success and turn value into cash? Vin Vashishta is the author of ‘From Data to Profit’ (Wiley), the playbook for monetizing data and AI. He built V-Squared from client 1 into one of the oldest data and AI consulting firms. For the last eight years, he has been recognized as a data and AI thought leader. Vin is a LinkedIn Top Voice and Gartner Ambassador. His background spans over 25 years in strategy, leadership, software engineering, and applied machine learning. Dr. Tiffany Perkins-Munn is on a mission to bring research, analytics, and data science to life. She earned her Ph.D. in Social-Personality Psychology with an interdisciplinary focus on Advanced Quantitative Methods. Her insights are the subject of countless lectures on psychology, statistics, and their real-world applications. As the Head of Data and Analytics for the innovative CDAO organization at J.P. Morgan Chase, her knack involves unraveling complex business problems through operational enhancements, augmented financials, and intuitive recruiting. After over two decades in the industry, she consistently forges robust relationships across the corporate spectrum, becoming one of the Top 10 Finalists in the Merrill Lynch Global Markets Innovation Program. In the episode, Richie, Vin, and Tiffany explore the challenges of monetizing data and AI projects, including how technical, organizational, and strategic factors affect your approach, the importance of aligning technical and business objectives to keep outputs focused on core business goals, how to assess your organization's data and AI maturity, examples of high data maturity businesses, data security and compliance, quick wins in data transformation and infrastructure, why long-term vision and strategy matter, and much more.

Links Mentioned in the Show:
- Connect with Tiffany on LinkedIn
- Connect with Vin on LinkedIn
- Vin’s Website
- [Course] Data Governance Concepts
- Related Episode: Scaling Enterprise Analytics with Libby Duane Adams, Chief Advocacy Officer and Co-Founder of Alteryx

New to DataCamp? Learn on the go using the DataCamp mobile app. Empower your business with world-class data and AI skills with DataCamp for business.

Data Engineering with Google Cloud Platform - Second Edition

Data Engineering with Google Cloud Platform is your ultimate guide to building scalable data platforms using Google Cloud technologies. In this book, you will learn how to leverage products such as BigQuery, Cloud Composer, and Dataplex for efficient data engineering. Expand your expertise and gain practical knowledge to excel in managing data pipelines within the Google Cloud ecosystem.

What this Book will help me do
- Understand foundational data engineering concepts using Google Cloud Platform.
- Learn to build and manage scalable data pipelines with tools such as Dataform and Dataflow.
- Explore advanced topics like data governance and secure data handling in Google Cloud.
- Boost readiness for Google Cloud data engineering certification with real-world exam guidance.
- Master cost-effective strategies and CI/CD practices for data engineering on Google Cloud.

Author(s)
Adi Wijaya, the author of this book, is a Data Strategic Cloud Engineer at Google with extensive experience in data engineering and the Google Cloud ecosystem. With his hands-on expertise, he emphasizes practical solutions and in-depth knowledge sharing, guiding readers through the intricacies of Google Cloud for data engineering success.

Who is it for?
This book is ideal for data analysts, IT practitioners, software engineers, and data enthusiasts aiming to excel in data engineering. Whether you're a beginner tackling fundamental concepts or an experienced professional exploring Google Cloud's advanced capabilities, this book is designed for you. It bridges your current skills with modern data engineering practices on Google Cloud, making it a valuable resource at any stage of your career.
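To make the governance side concrete, below is a minimal sketch (not from the book) of a BigQuery row-level access policy submitted through the google-cloud-bigquery Python client; the project, dataset, table, group, and filter column are hypothetical examples.

```python
from google.cloud import bigquery

# Assumes application-default credentials and an existing orders table.
# The project, dataset, table, group, and region column are hypothetical.
client = bigquery.Client(project="my-demo-project")

policy_sql = """
CREATE ROW ACCESS POLICY emea_only
ON `my-demo-project.sales.orders`
GRANT TO ("group:[email protected]")
FILTER USING (region = "EMEA")
"""

# Row-level security: members of the analyst group only see EMEA rows.
client.query(policy_sql).result()
```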

Countless companies invest in their data quality, but often the return on that investment is not fully realized in the output. It seems like, despite the critical importance of data quality, data governance might be suffering from a branding issue. Data governance is sometimes looked at as the data police, but this is far from the truth. So, how can we change perspectives and introduce fun into data governance? Tiankai Feng is a Principal Data Consultant and Data Strategy & Data Governance Lead at Thoughtworks. He also works part-time as the Head of Marketing at DAMA Germany. Tiankai has had many data hats in his career: marketing data analyst, data product owner, analytics capability lead, and data governance leader for the last few years. He has found a passion for the human side of data: how to collaborate, coordinate, and communicate around data. Tiankai often uses his music and humor to make data more approachable and fun. In the episode, Adel and Tiankai explore the importance of data governance in data-driven organizations, the challenges of data governance, how to define success criteria and measure the ROI of governance initiatives, non-invasive and creative approaches to data governance, the implications of generative AI on data governance, regulatory considerations, organizational culture and much more.

Links Mentioned in the Show:
- Tiankai’s YouTube Channel
- Data Governance Fundamentals Cheat Sheet
- [Webinar] Unpacking the Fun in Data Governance: The Key to Scaling Data Quality
- [Course] Data Governance Concepts
- Rewatch sessions from RADAR: The Analytics Edition

New to DataCamp? Learn on the go using the DataCamp mobile app. Empower your business with world-class data and AI skills with DataCamp for business.

Discover the transformative synergy between SAP Datasphere and Google BigQuery, driving data insights. We'll explore Datasphere's transformation, integration, and data governance capabilities alongside BigQuery's scalability and real-time analytics. Also learn how SAP GenAI Hub and Google Cloud accelerate AI initiatives and innovation. You will also hear real-world success stories on how businesses leverage this integration for tangible outcomes.

If you attend this session, your contact information may be shared with the sponsor for relevant follow-up for this event only. Please note: seating is limited and on a first-come, first-served basis; standing areas are available.


Hear from data and AI thought leaders on this panel as they discuss how generative AI has shaped their outlook and approach to data governance. They talk candidly about building a data governance strategy, what they’ve learned so far, and how Google Cloud has helped them.
