talk-data.com

Topic

Alation

data_catalog data_governance metadata_management

9 tagged

Activity Trend: peak of 3 per quarter, 2020-Q1 to 2026-Q1

Activities

9 activities · Newest first

As organisations scale AI and move towards Data Products, success depends on trusted, high-quality data underpinned by strong governance. In this fireside chat, Chemist Warehouse shares how domain-aligned metadata, data quality, and governance, powered by Alation, enable a unified delivery framework using Critical Data Elements (CDEs) to reduce risk, drive self-service, and build a foundation for AI-ready analytics and future data product initiatives.

Sponsored by: Alation | Better Together: Enterprise Catalog with Databricks & Alation at American Airlines

In the era of data-driven enterprises, true democratization requires more than just access: it demands context, trust, and governance at scale. In this session, discover how to seamlessly integrate Databricks Unity Catalog with Alation’s Enterprise Data Catalog to deliver:

  • End-to-End Lineage Storytelling: Unify technical and business views into a single, cohesive narrative that resonates with both technical engineers and non-technical stakeholders across business domains.
  • Accelerated and Democratized Insights: Automate metadata stitching to reduce time-to-insight, enabling analysts to answer critical business questions faster and drive multi-domain collaboration.
  • Empowered, Trustworthy Discovery: Equip business users with a unified platform, populated with rich documentation and usage signals, so they can find, understand, and confidently use trusted data assets.

To support its Digital First mission, the BBC is transforming into a data product organisation. This session will explore how the BBC's data strategy is driving a cultural and organisational shift that is evolving its data architecture and embedding data capabilities company wide. Discover the BBC's approach to developing certified, shareable data products that strengthen governance, enable self-service analytics, and establish a foundation for responsible AI use.

Today, we’re joined by Satyen Sangani, CEO & Co-Founder of Alation, a leader in enterprise data intelligence solutions. We talk about:

  • The role of AI in data management
  • Accelerating time to information
  • Fascinating use cases resulting from easier knowledge sharing
  • Increasing the accuracy of AI output
  • Entrepreneur-market fit and building a new product category
  • Is data democratization always useful?

Sponsored by: Alation | Unlocking the Power of Real-Time Data to Maximize Data Insights

It’s no secret that access to the right data at the right time is critical for data-driven decision making. In fact, as data culture becomes more and more ingrained in the enterprise, business users increasingly demand real-time, actionable data. But, what happens when it takes up to 24 hours to access your point-of-sale data? RaceTrac faced many of these data accessibility challenges as it sought to derive intelligence from its retail transaction data, specifically the data from their stores, information from their fuel purchasing arms, and delivery data for their fleet.

Through a combination of the Databricks Lakehouse and the lineage and self-discovery capabilities of the Alation Data Intelligence Platform, RaceTrac rose to the challenge. Hear from Raghu Jayachandran, Senior Manager of Enterprise Data at RaceTrac, and discover how RaceTrac gained real-time access to its transaction data in Databricks and uses Alation to show which data can drive the business insights it needs.

Talk by: Diby Malakar and Raghu Jayachandran

Here’s more to explore: Why the Data Lakehouse Is Your next Data Warehouse: https://dbricks.co/3Pt5unq Lakehouse Fundamentals Training: https://dbricks.co/44ancQs

Connect with us: Website: https://databricks.com Twitter: https://twitter.com/databricks LinkedIn: https://www.linkedin.com/company/databricks Instagram: https://www.instagram.com/databricksinc Facebook: https://www.facebook.com/databricksinc

Sponsored: Matillion | Using Matillion to Boost Productivity w/ Lakehouse and your Full Data Stack

In this presentation, Matillion’s Sarah Pollitt, Group Product Manager for ETL, will discuss how you can use Matillion to load data into your data lakehouse from popular sources such as Salesforce and SAP, using over a hundred out-of-the-box connectors. You can quickly transform this data using powerful tools like Matillion or dbt, or your own custom notebooks, to derive valuable insights. She will also explore how you can run streaming pipelines to ensure real-time data processing, and how you can extract and manage this data using popular governance tools such as Alation or Collibra, ensuring compliance and data quality. Finally, Sarah will showcase how you can seamlessly integrate this data into your analytics tools of choice, such as ThoughtSpot, Power BI, or any other analytics tool that fits your organization's needs.

Talk by: Rick Wear

Summary

This podcast started almost exactly six years ago, and the technology landscape was much different than it is now. In that time there have been a number of generational shifts in how data engineering is done. In this episode I reflect on some of the major themes and take a brief look forward at some of the upcoming changes.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management. Your host is Tobias Macey, and today I'm reflecting on the major trends in data engineering over the past 6 years.

Interview

  • Introduction
  • 6 years of running the Data Engineering Podcast
  • Around the first time that data engineering was discussed as a role
      • Followed on from hype about "data science"
  • Hadoop era
  • Streaming
  • Lambda and Kappa architectures
      • Not really referenced anymore
  • "Big Data" era of capture everything has shifted to focusing on data that presents value
      • Regulatory environment increases risk, better tools introduce more capability to understand what data is useful
  • Data catalogs
      • Amundsen and Alation
  • Orchestration engine
      • Oozie, etc. -> Airflow and Luigi -> Dagster, Prefect, Lyft, etc.
      • Orchestration is now a part of most vertical tools
  • Cloud data warehouses
  • Data lakes
  • DataOps and MLOps
  • Data quality to data observability
  • Metadata for everything
      • Data catalog -> data discovery -> active metadata
  • Business intelligence
      • Read only reports to metric/semantic layers
      • Embedded analytics and data APIs
  • Rise of ELT
      • dbt
      • Corresponding introduction of reverse ETL
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on running the podcast?
  • What do you have planned for the future of the podcast?

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on Apple Podcasts and tell your friends and co-workers.

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Sponsored By: Materialize

Looking for the simplest way to get the freshest data possible to your teams? Because let's face it: if real-time were easy, everyone would be using it. Look no further than Materialize, the streaming database you already know how to use.

Materialize’s PostgreSQL-compatible interface lets users leverage the tools they already use, with unsurpassed simplicity enabled by full ANSI SQL support. Delivered as a single platform with the separation of storage and compute, strict-serializability, active replication, horizontal scalability and workload isolation — Materialize is now the fastest way to build products with streaming data, drastically reducing the time, expertise, cost and maintenance traditionally associated with implementation of real-time features.

Sign up now for early access to Materialize and get started with the power of streaming data with the same simplicity and low implementation cost as batch cloud data warehouses.

Go to materialize.com

Support Data Engineering Podcast

On today’s episode, we’re joined by John Wills. John is the Field CTO at Alation, a data intelligence company that helps organizations find, understand and trust data.

We talk about:

  • John’s background and Alation.
  • Cataloging data within an organization.
  • How developers can access and use cataloged data.
  • Will data become more and more critical for organizations?
  • The friction between business growth and regulatory compliance.
  • The increasing complexity of data and how this impacts cataloging.
  • Different types of data marketplaces and the exchange between them.
  • The impact of machine learning and artificial intelligence on data cataloging.

John Wills - https://www.linkedin.com/in/johnwwills/ Alation - https://www.linkedin.com/company/alation/

This episode is brought to you by Qrvey

The tools you need to take action with your data, on a platform built for maximum scalability, security, and cost efficiencies. If you’re ready to reduce complexity and dramatically lower costs, contact us today at qrvey.com.

Qrvey, the modern no-code analytics solution for SaaS companies on AWS.

#saas #analytics #AWS #BI

Summary

The landscape of data management and processing is rapidly changing and evolving. There are certain foundational elements that have remained steady, but as the industry matures new trends emerge and gain prominence. In this episode Astasia Myers of Redpoint Ventures shares her perspective as an investor on which categories she is paying particular attention to for the near to medium term. She discusses the work being done to address challenges in the areas of data quality, observability, discovery, and streaming. This is a useful conversation to gain a macro perspective on where businesses are looking to improve their capabilities to work with data.

Announcements

  • Hello and welcome to the Data Engineering Podcast, the show about modern data management.
  • What are the pieces of advice that you wish you had received early in your career of data engineering? If you hand a book to a new data engineer, what wisdom would you add to it? I’m working with O’Reilly on a project to collect the 97 things that every data engineer should know, and I need your help. Go to dataengineeringpodcast.com/97things to add your voice and share your hard-earned expertise.
  • When you’re ready to build your next pipeline, or want to test out the projects you hear about on the show, you’ll need somewhere to deploy it, so check out our friends at Linode. With their managed Kubernetes platform it’s now even easier to deploy and scale your workflows, or try out the latest Helm charts from tools like Pulsar to get you up and running in no time. With simple pricing, fast networking, S3 compatible object storage, and worldwide data centers, you’ve got everything you need to run a bulletproof data platform. Go to dataengineeringpodcast.com/linode today and get a $60 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • You listen to this show because you love working with data and want to keep your skills up to date. Machine learning is finding its way into every aspect of the data landscape. Springboard has partnered with us to help you take the next step in your career by offering a scholarship to their Machine Learning Engineering career track program. In this online, project-based course every student is paired with a Machine Learning expert who provides unlimited 1:1 mentorship support throughout the program via video conferences. You’ll build up your portfolio of machine learning projects and gain hands-on experience in writing machine learning algorithms, deploying models into production, and managing the lifecycle of a deep learning prototype. Springboard offers a job guarantee, meaning that you don’t have to pay for the program until you get a job in the space. The Data Engineering Podcast is exclusively offering listeners 20 scholarships of $500 to eligible applicants. It only takes 10 minutes and there’s no obligation. Go to dataengineeringpodcast.com/springboard and apply today! Make sure to use the code AISPRINGBOARD when you enroll.
  • Your host is Tobias Macey and today I’m interviewing Astasia Myers about the trends in the data industry that she sees as an investor at Redpoint Ventures.

Interview

  • Introduction
  • How did you get involved in the area of data management?
  • Can you start by giving an overview of Redpoint Ventures and your role there?
  • From an investor perspective, what is most appealing about the category of data-oriented businesses?
  • What are the main sources of information that you rely on to keep up to date with what is happening in the data industry?
      • What is your personal heuristic for determining the relevance of any given piece of information to decide whether it is worthy of further investigation?
  • As someone who works closely with a variety of companies across different industry verticals and different areas of focus, what are some of the common trends that you have identified in the data ecosystem?
  • In your article that covers the trends you are keeping an eye on for 2020 you call out 4 in particular, data quality, data catalogs, observability of what influences critical business indicators, and streaming data. Taking those in turn:
      • What are the driving factors that influence data quality, and what elements of that problem space are being addressed by the companies you are watching?
          • What are the unsolved areas that you see as being viable for newcomers?
      • What are the challenges faced by businesses in establishing and maintaining data catalogs?
          • What approaches are being taken by the companies who are trying to solve this problem?
          • What shortcomings do you see in the available products?
      • For gaining visibility into the forces that impact the key performance indicators (KPI) of businesses, what is lacking in the current approaches?
          • What additional information needs to be tracked to provide the needed context for making informed decisions about what actions to take to improve KPIs?
          • What challenges do businesses in this observability space face to provide useful access and analysis to this collected data?
      • Streaming is an area that has been growing rapidly over the past few years, with many open source and commercial options. What are the major business opportunities that you see to make streaming more accessible and effective?
          • What are the main factors that you see as driving this growth in the need for access to streaming data?
  • With your focus on these trends, how does that influence your investment decisions and where you spend your time?
  • What are the unaddressed markets or product categories that you see which would be lucrative for new businesses?
  • In most areas of technology now there is a mix of open source and commercial solutions to any given problem, with varying levels of maturity and polish between them. What are your views on the balance of this relationship in the data ecosystem?
      • For data in particular, there is a strong potential for vendor lock-in which can cause potential customers to avoid adoption of commercial solutions. What has been your experience in that regard with the companies that you work with?

Contact Info

  • @AstasiaMyers on Twitter
  • @astasia on Medium
  • LinkedIn

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don’t forget to check out our other show, Podcast.init, to learn about the Python language, its community, and the innovative ways it is being used. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers. Join the community in the new Zulip chat workspace at dataengineeringpodcast.com/chat

Links

  • Redpoint Ventures
  • 4 Data Trends To Watch in 2020
  • Seagate
  • Western Digital
  • Pure Storage
  • Cisco
  • Cohesity
  • Looker
      • Podcast Episode
  • DGraph
      • Podcast Episode
  • Dremio
      • Podcast Episode
  • SnowflakeDB
      • Podcast Episode
  • ThoughtSpot
  • Tibco
  • Elastic
  • Splunk
  • Informatica
  • Data Council
  • DataCoral
  • Mattermost
  • Bitwarden
  • Snowplow
      • Podcast Interview
      • Interview About Snowplow Infrastructure
  • CHAOSSEARCH
      • Podcast Episode
  • Kafka Streams
  • Pulsar
      • Podcast Interview
      • Followup Podcast Interview
  • Soda
  • Toro
  • Great Expectations
  • Alation
  • Collibra
  • Amundsen
  • DataHub
  • Netflix Metacat
  • Marquez
      • Podcast Episode
  • LDAP == Lightweight Directory Access Protocol
  • Anodot
  • Databricks
  • Flink

a…