The journey from startup to billion-dollar enterprise requires more than just a great product: it demands strategic alignment between sales and marketing. How do you identify your ideal customer profile when you're just starting out? What data signals help you find the twins of your successful early adopters? With AI now automating everything from competitive analysis to content creation, the traditional boundaries between departments are blurring. But what personality traits should you look for when building teams that can scale with your growth? And how do you ensure your data strategy supports rather than hinders your AI ambitions in this rapidly evolving landscape?

Denise Persson is CMO at Snowflake and has 20 years of technology marketing experience at high-growth companies. Prior to joining Snowflake, she served as CMO for Apigee, an API platform company that went public in 2015 and was acquired by Google in 2016. She began her career at collaboration software company Genesys, where she built and led a global marketing organization. Denise also helped lead Genesys through its expansion to a successful IPO and eventual acquisition. Denise holds a BA in Business Administration and Economics from Stockholm University and an MBA from Georgetown University.

Chris Degnan is the former CRO at Snowflake and has over 15 years of enterprise technology sales experience. Before working at Snowflake, Chris served as the AVP of the West at EMC, and prior to that as VP Western Region at Aveksa, where he helped grow the business 250% year-over-year. Before Aveksa, Chris spent eight years at EMC and managed a team responsible for 175 select accounts. Prior to EMC, Chris worked in enterprise sales at Informatica and Covalent Technologies (acquired by VMware). He holds a BA from the University of Delaware.

In the episode, Richie, Denise, and Chris explore the journey to a billion dollars in ARR, the importance of customer obsession, aligning sales and marketing, leveraging data for decision-making, the role of AI in scaling operations, and much more.

Links Mentioned in the Show: Snowflake; Snowflake BUILD; Connect with Denise and Chris; Snowflake is FREE on DataCamp this week; Related Episode: Adding AI to the Data Warehouse with Sridhar Ramaswamy, CEO at Snowflake; Rewatch RADAR AI

New to DataCamp? Learn on the go using the DataCamp mobile app. Empower your business with world-class data and AI skills with DataCamp for business.
talk-data.com topic: DWH (Data Warehouse), 177 tagged
Summary
In this episode of the Data Engineering Podcast, host Tobias Macey welcomes back Nick Schrock, CTO and founder of Dagster Labs, to discuss Compass, a Slack-native, agentic analytics system designed to keep data teams connected with business stakeholders. Nick shares his journey from initial skepticism to embracing agentic AI as model and application advancements made it practical for governed workflows, and explores how Compass redefines the relationship between data teams and stakeholders by shifting analysts into steward roles, capturing and governing context, and integrating with Slack where collaboration already happens. The conversation covers organizational observability through Compass's conversational system of record, cost control strategies, and the implications of agentic collaboration on Conway's Law, as well as what's next for Compass and Nick's optimistic views on AI-accelerated software engineering.

Announcements
- Hello and welcome to the Data Engineering Podcast, the show about modern data management
- Data teams everywhere face the same problem: they're forcing ML models, streaming data, and real-time processing through orchestration tools built for simple ETL. The result? Inflexible infrastructure that can't adapt to different workloads. That's why Cash App and Cisco rely on Prefect. Cash App's fraud detection team got what they needed: flexible compute options, isolated environments for custom packages, and seamless data exchange between workflows. Each model runs on the right infrastructure, whether that's high-memory machines or distributed compute. Orchestration is the foundation that determines whether your data team ships or struggles. ETL, ML model training, AI Engineering, Streaming: Prefect runs it all from ingestion to activation in one platform. Whoop and 1Password also trust Prefect for their data operations. If these industry leaders use Prefect for critical workflows, see what it can do for you at dataengineeringpodcast.com/prefect.
- Data migrations are brutal. They drag on for months, sometimes years, burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
- Your host is Tobias Macey and today I'm interviewing Nick Schrock about building an AI analyst that keeps data teams in the loop

Interview
- Introduction
- How did you get involved in the area of data management?
- Can you describe what Compass is and the story behind it?
- Context repository structure
- How to keep it relevant/avoid sprawl/duplication
- Providing guardrails
- How does a tool like Compass help provide feedback/insights back to the data teams?
- Preparing the data warehouse for effective introspection by the AI
- LLM selection
- Cost management
- Caching/materializing ad-hoc queries
- Why Slack and enterprise chat are important to B2B software
- How AI is changing stakeholder relationships
- How not to overpromise AI capabilities
- How does Compass relate to BI?
- How does Compass relate to Dagster and Data Infrastructure?
- What are the most interesting, innovative, or unexpected ways that you have seen Compass used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on Compass?
- When is Compass the wrong choice?
- What do you have planned for the future of Compass?

Contact Info
- LinkedIn

Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements
- Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
- Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
- If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

Links
Dagster; Dagster Labs; Dagster Plus; Dagster Compass; Chris Bergh DataOps Episode; Rise of Medium Code blog post; Context Engineering; Data Steward; Information Architecture; Conway's Law; Temporal durable execution framework

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
There are very few people like Stephen Brobst. A legendary tech CTO and "certified data geek," Stephen shares his incredible journey, from his early days in computational physics and building real-time trading systems on Wall Street to becoming the CTO for Teradata and now Ab Initio Software. Stephen provides a masterclass on the evolution of data architecture, tracing the macro trends from early decision support systems to "active data warehousing" and the rise of AI/ML (formerly known as data mining). He dives deep into why metadata-driven architecture is critical for the future and how AI, large language models, and real-time sensor technology will fundamentally reshape industries and eliminate the dashboard as we know it. We also chat about something way cooler, as Stephen discusses his three passions: travel, music, and teaching. He reveals his personal rule of never staying in the same city for more than five consecutive days since 1993 and how he manages a life of constant motion. From his early days DJing punk rock and seeing the Sex Pistols' last concert to his minimalist travel philosophy and ever-growing bucket list, Stephen offers a unique perspective on living a life rich with experience over material possessions. Finally, he offers invaluable advice for the next generation on navigating careers in an AI-driven world and living life to the fullest.
Summary
In this episode of the Data Engineering Podcast, Serge Gershkovich, head of product at SqlDBM, talks about the socio-technical aspects of data modeling. Serge shares his background in data modeling and highlights its importance as a collaborative process between business stakeholders and data teams. He debunks common misconceptions that data modeling is optional or secondary, emphasizing its crucial role in ensuring alignment between business requirements and data structures. The conversation covers challenges in complex environments, the impact of technical decisions on data strategy, and the evolving role of AI in data management. Serge stresses the need for business stakeholders' involvement in data initiatives and a systematic approach to data modeling, warning against relying solely on technical expertise without considering business alignment.

Announcements
- Enterprises today face an enormous challenge: they're investing billions into Snowflake and Databricks, but without strong foundations, those investments risk becoming fragmented, expensive, and hard to govern. And that's especially evident in large, complex enterprise data environments. That's why companies like DirecTV and Pfizer rely on SqlDBM. Data modeling may be one of the most traditional practices in IT, but it remains the backbone of enterprise data strategy. In today's cloud era, that backbone needs a modern approach built natively for the cloud, with direct connections to the very platforms driving your business forward. Without strong modeling, data management becomes chaotic, analytics lose trust, and AI initiatives fail to scale. SqlDBM ensures enterprises don't just move to the cloud; they maximize their ROI by creating governed, scalable, and business-aligned data environments. If global enterprises are using SqlDBM to tackle the biggest challenges in data management, analytics, and AI, isn't it worth exploring what it can do for yours? Visit dataengineeringpodcast.com/sqldbm to learn more.
- Your host is Tobias Macey and today I'm interviewing Serge Gershkovich about how and why data modeling is a sociotechnical endeavor

Interview
- Introduction
- How did you get involved in the area of data management?
- Can you start by describing the activities that you think of when someone says the term "data modeling"?
- What are the main groupings of incomplete or inaccurate definitions that you typically encounter in conversation on the topic?
- How do those conceptions of the problem lead to challenges and bottlenecks in execution?
- Data modeling is often associated with data warehouse design, but it also extends to source systems and unstructured/semi-structured assets. How does the inclusion of other data localities help in the overall success of a data/domain modeling effort?
- Another aspect of data modeling that often consumes a substantial amount of debate is which pattern to adhere to (star/snowflake, data vault, one big table, anchor modeling, etc.). What are some of the ways that you have found effective to remove that as a stumbling block when first developing an organizational domain representation?
- While the overall purpose of data modeling is to provide a digital representation of the business processes, there are inevitable technical decisions to be made. What are the most significant ways that the underlying technical systems can help or hinder the goals of building a digital twin of the business?
- What impact (positive and negative) are you seeing from the introduction of LLMs into the workflow of data modeling?
- How does tool use (e.g. MCP connection to warehouse/lakehouse) help when developing the transformation logic for achieving a given domain representation?
- What are the most interesting, innovative, or unexpected ways that you have seen organizations address the data modeling lifecycle?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working with organizations implementing a data modeling effort?
- What are the overall trends in the ecosystem that you are monitoring related to data modeling practices?

Contact Info
- LinkedIn

Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links
sqlDBM; SAP; Joe Reis; ERD == Entity Relationship Diagram; Master Data Management; dbt; Data Contracts; Data Modeling With Snowflake book by Serge (affiliate link); Type 2 Dimension; Data Vault; Star Schema; Anchor Modeling; Ralph Kimball; Bill Inmon; Sixth Normal Form; MCP == Model Context Protocol
Summary
In this episode of the Data Engineering Podcast, host Tobias Macey welcomes back Shinji Kim to discuss the evolving role of semantic layers in the era of AI. As they explore the challenges of managing vast data ecosystems and providing context to data users, they delve into the significance of semantic layers for AI applications. They dive into the nuances of semantic modeling, the impact of AI on data accessibility, and the importance of business logic in semantic models. Shinji shares her insights on how SelectStar is helping teams navigate these complexities, and together they cover the future of semantic modeling as a native construct in data systems. Join them for an in-depth conversation on the evolving landscape of data engineering and its intersection with AI.

Announcements
- Your host is Tobias Macey and today I'm interviewing Shinji Kim about the role of semantic layers in the era of AI

Interview
- Introduction
- How did you get involved in the area of data management?
- Semantic modeling gained a lot of attention ~4-5 years ago in the context of the "modern data stack". What is your motivation for revisiting that topic today?
- There are several overlapping concepts: "semantic layer," "metrics layer," "headless BI." How do you define these terms, and what are the key distinctions and overlaps?
- Do you see these concepts converging, or do they serve distinct long-term purposes?
- Data warehousing and business intelligence have been around for decades now. What new value does semantic modeling add beyond practices like star schemas, OLAP cubes, etc.?
- What benefits does a semantic model provide when integrating your data platform into AI use cases?
- How does using AI as an interface to your analytical use cases differ from powering customer-facing AI applications with your data?
- The effort to create and maintain a set of semantic models is non-zero. What role can LLMs play in helping to propose and construct those models?
- For teams who have already invested in building this capability, what additional context and metadata is necessary to provide guidance to LLMs when working with their models?
- What's the most effective way to create a semantic layer without turning it into a massive project?
- There are several technologies available for building and serving these models. What are the selection criteria that you recommend for teams who are starting down this path?
- What are the most interesting, innovative, or unexpected ways that you have seen semantic models used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working with semantic modeling?
- When is semantic modeling the wrong choice?
- What do you predict for the future of semantic modeling?

Contact Info
- LinkedIn

Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links
SelectStar; Sun Microsystems; Markov Chain Monte Carlo; Semantic Modeling; Semantic Layer; Metrics Layer; Headless BI; Cube (Podcast Episode); AtScale; Star Schema; Data Vault; OLAP Cube; RAG == Retrieval Augmented Generation (AI Engineering Podcast Episode); KNN == K-Nearest Neighbors; HNSW == Hierarchical Navigable Small World; dbt Metrics Layer; Soda Data; LookML; Hex; PowerBI; Tableau; Semantic View (Snowflake); Databricks Genie; Snowflake Cortex Analyst; Malloy
Summary
In this episode of the Data Engineering Podcast, Bartosz Mikulski talks about preparing data for AI applications. Bartosz shares his journey from data engineering to MLOps and emphasizes the importance of data testing over software development in AI contexts. He discusses the types of data assets required for AI applications, including extensive test datasets, especially in generative AI, and explains the differences in data requirements for various AI application styles. The conversation also explores the skills data engineers need to transition into AI, such as familiarity with vector databases and new data modeling strategies, and highlights the challenges of evolving AI applications, including frequent reprocessing of data when changing chunking strategies or embedding models.

Announcements
- Your host is Tobias Macey and today I'm interviewing Bartosz Mikulski about how to prepare data for use in AI applications

Interview
- Introduction
- How did you get involved in the area of data management?
- Can you start by outlining some of the main categories of data assets that are needed for AI applications?
- How does the nature of the application change those requirements? (e.g. RAG app vs. agent, etc.)
- How do the different assets map to the stages of the application lifecycle?
- What are some of the common roles and divisions of responsibility that you see in the construction and operation of a "typical" AI application?
- For data engineers who are used to data warehousing/BI, what are the skills that map to AI apps?
- What are some of the data modeling patterns that are needed to support AI apps?
  - chunking strategies
  - metadata management
- What are the new categories of data that data engineers need to manage in the context of AI applications?
  - agent memory generation/evolution
  - conversation history management
  - data collection for fine tuning
- What are some of the notable evolutions in the space of AI applications and their patterns that have happened in the past ~1-2 years that relate to the responsibilities of data engineers?
- What are some of the skills gaps that teams should be aware of and identify training opportunities for?
- What are the most interesting, innovative, or unexpected ways that you have seen data teams address the needs of AI applications?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on AI applications and their reliance on data?
- What are some of the emerging trends that you are paying particular attention to?

Contact Info
- Website
- LinkedIn

Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links
Spark; Ray; Chunking Strategies; Hypothetical document embeddings; Model Fine Tuning; Prompt Compression
As we look back at 2024, we're highlighting some of our favourite episodes of the year, and with 100 of them to choose from, it wasn't easy! The four guests we'll be recapping with are:
- Lea Pica: A celebrity in the data storytelling and visualisation space. Richie and Lea cover the full picture of data presentation, how to understand your audience, how to leverage Hollywood storytelling and more. Out December 19.
- Alex Banks: Founder of Sunday Signal. Adel and Alex cover Alex's journey into AI and what led him to create Sunday Signal, the potential of AI, prompt engineering at its most basic level, chain of thought prompting, the future of LLMs and more. Out December 23.
- Don Chamberlin: The renowned co-inventor of SQL. Richie and Don explore the early development of SQL, how it became standardized, the future of SQL through NoSQL and SQL++ and more. Out December 26.
- Tom Tunguz: General Partner at Theory Ventures, a $235m VC firm. Richie and Tom explore trends in generative AI, cloud+local hybrid workflows, data security, the future of business intelligence and data analytics, AI in the corporate sector and more. Out December 30.

Rapid change seems to be the new norm within the data and AI space, and with the ecosystem constantly changing, it can be tricky to keep up. Fortunately, any self-respecting venture capitalist looking into data and AI will stay on top of what's changing and where the next big breakthroughs are likely to come from. We all want to know which important trends are emerging and how we can take advantage of them, so why not learn from a leading VC?

Tomasz Tunguz is a General Partner at Theory Ventures, a $235m early-stage venture capital firm. He blogs at tomtunguz.com and co-authored Winning with Data. He has worked or works with Looker, Kustomer, Monte Carlo, Dremio, Omni, Hex, Spot, Arbitrum, Sui and many others. He was previously the product manager for Google's social media monetization team, including the Google-MySpace partnership, and managed the launches of AdSense into six new markets in Europe and Asia. Before Google, Tunguz developed systems for the Department of Homeland Security at Appian Corporation.

In the episode, Richie and Tom explore trends in generative AI, the impact of AI on professional fields, cloud+local hybrid workflows, data security, changes in data warehousing through the use of integrated AI tools, the future of business intelligence and data analytics, and the challenges and opportunities surrounding AI in the corporate sector. You'll also get to discover Tom's picks for the hottest new data startups.

Links Mentioned in the Show: Tom's Blog; Theory Ventures; Article: What Air Canada Lost In 'Remarkable' Lying AI Chatbot Case; [Course] Implementing AI Solutions in Business; Related Episode: Making Better Decisions using Data & AI with Cassie Kozyrkov, Google's First Chief Decision Scientist; Sign up to RADAR: AI...
Bill Inmon is considered the father of the data warehouse. I just got back from spending a couple of days with Bill, and we discussed the history of the data industry and the data warehouse. On my flight back, I realized people could benefit from a short version of our conversation.
In this short chat, we discuss what a data warehouse is (and is not), Kimball and Inmon, the origins of the data warehouse, and much more.
We're improving DataFramed, and we need your help! We want to hear what you have to say about the show, and how we can make it more enjoyable for you. Find out more here.

Integrating generative AI with robust databases is becoming essential. As organizations face a plethora of database options and AI tools, making informed decisions is crucial for enhancing customer experiences and operational efficiency. How do you ensure your AI systems are powered by high-quality data? And how can these choices impact your organization's success?

Gerrit Kazmaier is the VP and GM of Data Analytics at Google Cloud. Gerrit leads the development and design of Google Cloud's data technology, which includes data warehousing and analytics. Gerrit's mission is to build a unified data platform for all types of data processing as the foundation for the digital enterprise. Before joining Google, Gerrit served as President of the HANA & Analytics team at SAP in Germany and led the global Product, Solution & Engineering teams for Databases, Data Warehousing and Analytics. In 2015, Gerrit served as the Vice President of SAP Analytics Cloud in Vancouver, Canada.

In this episode, Richie and Gerrit explore the transformative role of AI in data tools, the evolution of dashboards, the integration of AI with existing workflows, the challenges and opportunities in SQL code generation, the importance of a unified data platform, leveraging unstructured data, and much more.

Links Mentioned in the Show: Google Cloud; Connect with Gerrit; Thinking Fast and Slow by Daniel Kahneman; Course: Introduction to GCP; Related Episode: Not Only Vector Databases: Putting Databases at the Heart of AI, with Andi Gutmans, VP and GM of Databases at Google; Rewatch sessions from RADAR: Forward Edition
Summary
Gleb Mezhanskiy, CEO and co-founder of Datafold, joins Tobias Macey to discuss the challenges and innovations in data migrations. Gleb shares his experiences building and scaling data platforms at companies like Autodesk and Lyft, and how these experiences inspired the creation of Datafold to address data quality issues across teams. He outlines the complexities of data migrations, including common pitfalls such as technical debt and the importance of achieving parity between old and new systems. Gleb also discusses Datafold's use of AI and large language models (LLMs) to automate translation and reconciliation processes in data migrations, reducing the time and effort that migrations require.

Announcements
- Imagine catching data issues before they snowball into bigger problems. That's what Datafold's new Monitors do. With automatic monitoring for cross-database data diffs, schema changes, key metrics, and custom data tests, you can catch discrepancies and anomalies in real time, right at the source. Whether it's maintaining data integrity or preventing costly mistakes, Datafold Monitors give you the visibility and control you need to keep your entire data stack running smoothly. Want to stop issues before they hit production? Learn more at dataengineeringpodcast.com/datafold today!
- Your host is Tobias Macey and today I'm welcoming back Gleb Mezhanskiy to talk about Datafold's experience bringing AI to bear on the problem of migrating your data stack

Interview
- Introduction
- How did you get involved in the area of data management?
- Can you describe what the Data Migration Agent is and the story behind it?
- What is the core problem that you are targeting with the agent?
- What are the biggest time sinks in the process of database and tooling migration that teams run into?
- Can you describe the architecture of your agent?
- What was your selection and evaluation process for the LLM that you are using?
- What were some of the main unknowns that you had to discover going into the project?
- What are some of the evolutions in the ecosystem that occurred either during the development process or since your initial launch that have caused you to second-guess elements of the design?
- In terms of SQL translation there are libraries such as SQLGlot and the work being done with SDF that aim to address that through AST parsing and subsequent dialect generation. What are the ways that approach is insufficient in the context of a platform migration?
- How does the approach you are taking with the combination of data-diffing and automated translation help build confidence in the migration target?
- What are the most interesting, innovative, or unexpected ways that you have seen the Data Migration Agent used?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on building an AI-powered migration assistant?
- When is the data migration agent the wrong choice?
- What do you have planned for the future of applications of AI at Datafold?

Contact Info
- LinkedIn

Parting Question
- From your perspective, what is the biggest gap in the tooling or technology for data management today?

Links
Datafold; Datafold Migration Agent; Datafold data-diff; Datafold Reconciliation Podcast Episode; SQLGlot; Lark parser; Claude 3.5 Sonnet; Looker (Podcast Episode)
The Data Product Management In Action podcast, brought to you by Soda and executive producer Scott Hirleman, is a platform for data product management practitioners to share insights and experiences. In Season 01, Episode 19, host Nadiem von Heydebrand interviews Pradeep Fernando, who leads the data and metadata management initiative at Swisscom. They explore key topics in data product management, including the definition and categorization of data products, the role of AI, prioritization strategies, and the application of product management principles. Pradeep shares valuable insights and experiences on successfully implementing data product management within organizations. About our host Nadiem von Heydebrand: Nadiem is the CEO and Co-Founder of Mindfuel. In 2019, he merged his passion for data science with product management, becoming a thought leader in data product management. Nadiem is dedicated to demonstrating the true value contribution of data. With over a decade of experience in the data industry, Nadiem leverages his expertise to scale data platforms, implement data mesh concepts, and transform AI performance into business performance, delighting consumers at global organizations that include Volkswagen, Munich Re, Allianz, Red Bull, and Vorwerk. Connect with Nadiem on LinkedIn. About our guest Pradeep Fernando: Pradeep is a seasoned data product leader with over 6 years of data product leadership experience and over 10 years of product management experience. He leads or is a key contributor to several company-wide data & analytics initiatives at Swisscom such as Data as a Product (Data Mesh), One Data Platform, Machine Learning (Factory), MetaData management, Self-service data & analytics, BI Tooling Strategy, Cloud Transformation, Big Data platforms,and Data warehousing. 
Previously, he was a product manager in both Swisscom's B2B and Innovation units, building new products and optimizing mature products for profitability in the domains of enterprise mobile fleet management and cyber and mobile device security. Pradeep is also passionate about and experienced in leading the development of data products and transforming IT delivery teams into empowered, agile product teams. He is always happy to engage in a conversation about lean product management or "heavier" topics such as humanity's future or our past. Connect with Pradeep on LinkedIn. All views and opinions expressed are those of the individuals and do not necessarily reflect those of their employers or anyone else. Join the conversation on LinkedIn. Apply to be a guest or nominate someone that you know. Do you love what you're listening to? Please rate and review the podcast, and share it with fellow practitioners you know. Your support helps us reach more listeners and continue providing valuable insights!
As organizations grapple with data spread across various storage locations, solutions like Coginiti Hybrid Query offer a much-needed alternative to fragmented tools. Published at: https://www.eckerson.com/articles/a-novel-approach-for-reducing-cloud-data-warehouse-expenses-from-coginiti
Summary: In this episode of the Data Engineering Podcast, host Tobias Macey welcomes back Chris Bergh, CEO of DataKitchen, to discuss his ongoing mission to simplify the lives of data engineers. Chris explains the challenges faced by data engineers, such as constant system failures, the need for rapid changes, and high customer demands. Chris delves into the concept of DataOps, its evolution, and the misappropriation of related terms like data mesh and data observability. He emphasizes the importance of focusing on processes and systems rather than just tools to improve data engineering workflows. Chris also introduces DataKitchen's open-source tools, DataOps TestGen and DataOps Observability, designed to automate data quality validation and monitor data journeys in production. Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst is an end-to-end data lakehouse platform built on Trino, the query engine Apache Iceberg was designed for, with complete support for all table formats including Apache Iceberg, Hive, and Delta Lake. Trusted by teams of all sizes, including Comcast and Doordash. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino. Your host is Tobias Macey and today I'm interviewing Chris Bergh about his tireless quest to simplify the lives of data engineers. Interview: Introduction. How did you get involved in the area of data management? Can you describe what DataKitchen is and the story behind it? You helped to define and popularize "DataOps", which then went through a journey of misappropriation similar to "DevOps", and has since faded in use. 
What is your view on the realities of "DataOps" today? Out of the popularized wave of "DataOps" tools came subsequent trends in data observability, data reliability engineering, etc. How have those cycles influenced the way that you think about the work that you are doing at DataKitchen? The data ecosystem went through a massive growth period over the past ~7 years, and we are now entering a cycle of consolidation. What are the fundamental shifts that we have gone through as an industry in the management and application of data? What are the challenges that never went away? You recently open sourced the dataops-testgen and dataops-observability tools. What are the outcomes that you are trying to produce with those projects? What are the areas of overlap with existing tools and what are the unique capabilities that you are offering? Can you talk through the technical implementation of your new observability and quality testing platform? What does the onboarding and integration process look like? Once a team has one or both tools set up, what are the typical points of interaction that they will have over the course of their workday? What are the most interesting, innovative, or unexpected ways that you have seen dataops-observability/testgen used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on promoting DataOps? What do you have planned for the future of your work at DataKitchen? Contact Info: LinkedIn. Parting Question: From your perspective, what is the biggest gap in the tooling or technology for data management today? Links: DataKitchen; Podcast Episode; NASA; DataOps Manifesto; Data Reliability Engineering; Data Observability; dbt; DevOps Enterprise Summit; Building The Data Warehouse by Bill Inmon (affiliate link); dataops-testgen, dataops-observability; Free Data Quality and Data Observability Certification; Databricks; DORA Metrics; DORA for data. The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
Whether big or small, one of the biggest challenges organizations face when they want to work with data effectively is often lack of access to it. This is where building a data platform comes in. But building a data platform is no easy feat. It's not just about centralizing data in the data warehouse, it’s also about making sure that data is actionable, trustworthy, and usable. So, how do you make sure your data platform is up to par? Shuang Li is Group Product Manager at Box. With experience building data, analytics, ML, and observability platform products for both external and internal customers, Shuang is always passionate about the insights, optimizations, and predictions that big data and AI/ML make possible. Throughout her career, she transitioned from academia to engineering, from engineering to product management, and then from an individual contributor to an emerging product executive. In the episode, Adel and Shuang explore her career journey, including transitioning from academia to engineering and helping to work on Google Fiber, how to build a data platform, ingestion pipelines, processing pipelines, challenges and milestones in building a data platform, data observability and quality, developer experience, data democratization, future trends and a lot more. Links Mentioned in the Show: Box; Connect with Shuang on LinkedIn; [Course] Understanding Modern Data Architecture; Related Episode: Scaling Enterprise Analytics with Libby Duane Adams, Chief Advocacy Officer and Co-Founder of Alteryx. New to DataCamp? Learn on the go using the DataCamp mobile app. Empower your business with world-class data and AI skills with DataCamp for business.
In the fast-paced work environments we are used to, the ability to quickly find and understand data is essential. Data professionals can often spend more time searching for data than analyzing it, which can hinder business progress. Innovations like data catalogs and automated lineage systems are transforming data management, making it easier to ensure data quality, trust, and compliance. By creating a strong metadata foundation and integrating these tools into existing workflows, organizations can enhance decision-making and operational efficiency. But how did this all come to be, and who is driving better access and collaboration through data? Prukalpa Sankar is the Co-founder of Atlan. Atlan is a modern data collaboration workspace (like GitHub for engineering or Figma for design). By acting as a virtual hub for data assets ranging from tables and dashboards to models & code, Atlan enables teams to create a single source of truth for all their data assets, and collaborate across the modern data stack through deep integrations with tools like Slack, BI tools, data science tools and more. A pioneer in the space, Atlan was recognized by Gartner as a Cool Vendor in DataOps, as one of the top 3 companies globally. Prukalpa previously co-founded SocialCops, a world-leading data-for-good company (New York Times Global Visionary, World Economic Forum Tech Pioneer). SocialCops is behind landmark data projects including India’s National Data Platform and SDGs global monitoring in collaboration with the United Nations. She was awarded Economic Times Emerging Entrepreneur for the Year, Forbes 30u30, Fortune 40u40, and Top 10 CNBC Young Business Women 2016, and is a TED Speaker. 
In the episode, Richie and Prukalpa explore challenges within data discoverability, the inception of Atlan, the importance of a data catalog, personalization in data catalogs, data lineage, building data lineage, implementing data governance, human collaboration in data governance, skills for effective data governance, product design for diverse audiences, regulatory compliance, the future of data management and much more. Links Mentioned in the Show: Atlan; Connect with Prukalpa; [Course] Artificial Intelligence (AI) Strategy; Related Episode: Adding AI to the Data Warehouse with Sridhar Ramaswamy, CEO at Snowflake; Sign up to RADAR: AI Edition. New to DataCamp? Learn on the go using the DataCamp mobile app. Empower your business with world-class data and AI skills with DataCamp for business.
In today's fast-paced digital world, managing IT operations is more complex than ever. With the rise of cloud services, microservices, and constant software deployments, the pressure on IT teams to keep everything running smoothly is immense. But how do you keep up with the ever-growing flood of data and ensure your systems are always available? AIOps is the use of artificial intelligence to automate and scale IT operations. But what exactly is AIOps, and how can it transform your IT operations? Assaf Resnick is the CEO and Co-Founder of BigPanda. Before founding BigPanda, Assaf was an investor at Sequoia Capital, where he focused on early and growth-stage investing in software, internet, and mobile sectors. Assaf’s time at Sequoia gave him a front-row seat to the challenges of IT scale, complexity, and velocity faced by Operations teams in rapidly scaling and accelerating organizations. This is the problem that Assaf founded BigPanda to solve. In the episode, Richie and Assaf explore AIOps, how AIOps helps manage increasingly complex IT operations, how AIOps differs from DevOps and MLOps, examples of AIOps projects, a real-world application of AIOps, the key benefits of AIOps, how to implement AIOps, excitement in the space, how GenAI is improving AIOps and much more. Links Mentioned in the Show: BigPanda; Gartner: Market Guide for AIOps Platforms; [Course] Implementing AI Solutions in Business; Related Episode: Adding AI to the Data Warehouse with Sridhar Ramaswamy, CEO at Snowflake; Sign up to RADAR: AI Edition. New to DataCamp? Learn on the go using the DataCamp mobile app. Empower your business with world-class data and AI skills with DataCamp for business.
Rapid change seems to be the new norm within the data and AI space, and with the ecosystem constantly changing, it can be tricky to keep up. Fortunately, any self-respecting venture capitalist looking into data and AI will stay on top of what’s changing and where the next big breakthroughs are likely to come from. We all want to know which important trends are emerging and how we can take advantage of them, so why not learn from a leading VC? Tomasz Tunguz is a General Partner at Theory Ventures, a $235m early-stage venture capital firm. He blogs at tomtunguz.com & co-authored Winning with Data. He has worked or works with Looker, Kustomer, Monte Carlo, Dremio, Omni, Hex, Spot, Arbitrum, Sui & many others. He was previously the product manager for Google's social media monetization team, including the Google-MySpace partnership, and managed the launches of AdSense into six new markets in Europe and Asia. Before Google, Tunguz developed systems for the Department of Homeland Security at Appian Corporation. In the episode, Richie and Tom explore trends in generative AI, the impact of AI on professional fields, cloud+local hybrid workflows, data security, changes in data warehousing through the use of integrated AI tools, the future of business intelligence and data analytics, and the challenges and opportunities surrounding AI in the corporate sector. You'll also get to discover Tom's picks for the hottest new data startups. Links Mentioned in the Show: Tom’s Blog; Theory Ventures; Article: What Air Canada Lost In ‘Remarkable’ Lying AI Chatbot Case; [Course] Implementing AI Solutions in Business; Related Episode: Making Better Decisions using Data & AI with Cassie Kozyrkov, Google's First Chief Decision Scientist; Sign up to RADAR: AI Edition. New to DataCamp? Learn on the go using the DataCamp mobile app. Empower your business with world-class data and AI skills with DataCamp for business.
We’ve heard so much about the value and capabilities of generative AI over the past year, and we’ve all become accustomed to the chat interfaces of our preferred models. One of the main concerns many of us have had has been privacy. Is OpenAI keeping the data and information I give to ChatGPT secure? One of the touted solutions to this problem is running LLMs locally on your own machine, but with the hardware cost that comes with it, running LLMs locally has not been possible for many of us. That might now be starting to change. Nuri Cankaya is VP of AI Marketing at Intel. Prior to Intel, Nuri spent 16 years at Microsoft, starting out as a Technical Evangelist, and leaving the organization as the Senior Director of Product Marketing. He ran the GTM team that helped generate adoption of GPT in Microsoft Azure products. La Tiffaney Santucci is Intel’s AI Marketing Director, specializing in their Edge and Client products. La Tiffaney has spent over a decade at Intel, focusing on partnerships with Dell, Google, Amazon and Microsoft. In the episode, Richie, Nuri and La Tiffaney explore AI’s impact on marketing analytics, the adoption of AI in the enterprise, how AI is being integrated into existing products, the workflow for implementing AI into business processes and the challenges that come with it, the importance of edge AI for instant decision-making in use cases like self-driving cars, the emergence of AI engineering as a distinct field of work, the democratization of AI, what the state of AGI might look like in the near future and much more. About the AI and the Modern Data Stack DataFramed Series: This week we’re releasing 4 episodes focused on how AI is changing the modern data stack and the analytics profession at large. The modern data stack is often an ambiguous and all-encompassing term, so we intentionally wanted to cover the impact of AI on the modern data stack from different angles. 
Here’s what you can expect: Why the Future of AI in Data will be Weird with Benn Stancil, CTO at Mode & Field CTO at ThoughtSpot: covering how AI will change analytics workflows and tools; How Databricks is Transforming Data Warehousing and AI with Ari Kaplan, Head Evangelist & Robin Sutara, Field CTO at Databricks: covering Databricks, data intelligence and how AI tools are changing data democratization; Adding AI to the Data Warehouse with Sridhar Ramaswamy, CEO at Snowflake: covering Snowflake and its uses, how generative AI is changing the attitudes of leaders towards data, and how to improve your data management; Accelerating AI Workflows with Nuri Cankaya, VP of AI Marketing & La Tiffaney Santucci, AI Marketing Director at Intel: covering AI’s impact on marketing analytics, how AI is being integrated into existing products, and the democratization of AI. Links Mentioned in the Show: Intel OpenVINO™ toolkit; Intel Developer Clouds for Accelerated Computing; AWS Re:Invent; [Course] Implementing AI Solutions in Business; Related Episode: Intel CTO Steve Orrin on How Governments Can Navigate the Data & AI Revolution; Sign up to https://www.datacamp.com/radar-analytics-edition ...
Snowflake has been foundational in the data space for years. In the mid-2010s, the platform was a major driver of moving data to the cloud. More recently, it's become apparent that combining data and AI in the cloud is key to accelerating innovation. Snowflake has been rapidly adding AI features to provide value to the modern data stack, but what’s really been going on under the hood? At the time of recording, Sridhar Ramaswamy was the SVP of AI at Snowflake; he was appointed CEO of Snowflake in February 2024. Sridhar was formerly Co-Founder of Neeva, acquired in 2023 by Snowflake. Before founding Neeva, Ramaswamy oversaw Google's advertising products, including search, display, video advertising, analytics, shopping, payments, and travel. He joined Google in 2003 and was part of the growth of AdWords and Google's overall advertising business. He spent more than 15 years at Google, where he started as a software engineer and rose to SVP of Ads & Commerce. In the episode, Richie and Sridhar explore Snowflake and its uses, how generative AI is changing the attitudes of leaders towards data, how NLP and AI have impacted enterprise business operations as well as new applications of AI in an enterprise environment, the challenges of enterprise search, the importance of data quality, management and the role of semantic layers in the effective use of AI, a look into Snowflake's products including Snowpilot and Cortex, the collaboration required for successful data and AI projects, advice for organizations looking to improve their data management and much more. About the AI and the Modern Data Stack DataFramed Series: This week we’re releasing 4 episodes focused on how AI is changing the modern data stack and the analytics profession at large. The modern data stack is often an ambiguous and all-encompassing term, so we intentionally wanted to cover the impact of AI on the modern data stack from different angles. 
Here’s what you can expect: Why the Future of AI in Data will be Weird with Benn Stancil, CTO at Mode & Field CTO at ThoughtSpot: covering how AI will change analytics workflows and tools; How Databricks is Transforming Data Warehousing and AI with Ari Kaplan, Head Evangelist & Robin Sutara, Field CTO at Databricks: covering Databricks, data intelligence and how AI tools are changing data democratization; Adding AI to the Data Warehouse with Sridhar Ramaswamy, CEO at Snowflake: covering Snowflake and its uses, how generative AI is changing the attitudes of leaders towards data, and how to improve your data management; Accelerating AI Workflows with Nuri Cankaya, VP of AI Marketing & La Tiffaney Santucci, AI Marketing Director at Intel: covering AI’s impact on marketing analytics, how AI is being integrated into existing products, and the democratization of AI. Links Mentioned in the Show: Snowflake; Snowflake acquires Neeva to accelerate search in the Data Cloud through generative AI; Use AI in Seconds with Snowflake Cortex; [Course] Introduction to Snowflake; Related Episode: Why AI will Change Everything - with Former Snowflake CEO, Bob Muglia; Sign up to a...
Databricks started out as a platform for using Spark, a big data analytics engine, but it's grown a lot since then. Databricks now allows users to leverage their data and AI projects in the same place, ensuring ease of use and consistency across operations. The Databricks platform is converging on the idea of data intelligence, but what does this mean, how will it help data teams and organizations, and where does AI fit in the picture? Ari Kaplan is Databricks’ Head of Evangelism and "The Real Moneyball Guy" - the popular movie was partly based on his analytical innovations in Major League Baseball. He is a leading influencer in analytics, artificial intelligence, data science, and high-growth business innovation. Ari was previously the Global AI Evangelist at DataRobot, Nielsen’s regional VP of Analytics, Caltech Alumni of the Decade, President Emeritus of the worldwide Independent Oracle Users Group, on Intel’s AI Board of Advisors, Sports Illustrated Top Ten GM Candidate, an IBM Watson Celebrity Data Scientist, and on the Crain’s Chicago 40 Under 40. He's also written 5 books on analytics, databases, and baseball. Robin Sutara is the Field CTO at Databricks. She has consulted with hundreds of organizations on data strategy, data culture, and building diverse data teams. Robin has had an eclectic career path in technical and business functions with more than two decades in tech companies, including Microsoft and Databricks. She also has multiple academic accomplishments, from her juris doctorate to a master's in law to engineering leadership. From her first technical role as an entry-level consumer support engineer to her current role in the C-Suite, Robin supports creating an inclusive workplace and is the current co-chair of the Women in Data Safety Committee. She was also recognized in 2023 as a Top 20 Women in Data and Tech, as well as DataIQ 100 Most Influential People in Data. 
In the episode, Richie, Ari, and Robin explore Databricks, the application of generative AI in improving services operations and providing data insights, data intelligence, and lakehouse technology, the wide-ranging applications of generative AI, how AI tools are changing data democratization, the challenges of data governance and management and how tools like Databricks can help, how jobs in data and AI are changing and much more. About the AI and the Modern Data Stack DataFramed Series: This week we’re releasing 4 episodes focused on how AI is changing the modern data stack and the analytics profession at large. The modern data stack is often an ambiguous and all-encompassing term, so we intentionally wanted to cover the impact of AI on the modern data stack from different angles. Here’s what you can expect: Why the Future of AI in Data will be Weird with Benn Stancil, CTO at Mode & Field CTO at ThoughtSpot: covering how AI will change analytics workflows and tools; How Databricks is Transforming Data Warehousing and AI with Ari Kaplan, Head Evangelist & Robin Sutara, Field CTO at Databricks: covering Databricks, data intelligence and how AI tools are changing data democratization; Adding AI to the Data Warehouse with Sridhar Ramaswamy, CEO at Snowflake: covering Snowflake and its uses, how generative AI is changing the attitudes of leaders towards data, and how to improve your data management; Accelerating AI Workflows with Nuri Cankaya, VP of AI Marketing & La Tiffaney Santucci, AI Marketing Director at Intel: covering AI’s impact on marketing analytics, how AI is being integrated into existing products, and the democratization of AI. Links Mentioned in the Show: Databricks; Delta Lake; https://mlflow.org/ ...