talk-data.com

Topic

Dashboard

data_visualization reporting bi

306

tagged

Activity Trend

Peak of 23 activities per quarter · 2020-Q1 to 2026-Q1

Activities

306 activities · Newest first

There are very few people like Stephen Brobst. A legendary tech CTO and "certified data geek," Stephen shares his incredible journey, from his early days in computational physics and building real-time trading systems on Wall Street to becoming CTO of Teradata and now Ab Initio Software. Stephen provides a masterclass on the evolution of data architecture, tracing the macro trends from early decision support systems to "active data warehousing" and the rise of AI/ML (formerly known as data mining). He dives deep into why metadata-driven architecture is critical for the future and how AI, large language models, and real-time sensor technology will fundamentally reshape industries and eliminate the dashboard as we know it. We also chat about something way cooler, as Stephen discusses his three passions: travel, music, and teaching. He reveals his personal rule, kept since 1993, of never staying in the same city for more than five consecutive days, and how he manages a life of constant motion. From his early days DJing punk rock and seeing the Sex Pistols' last concert to his minimalist travel philosophy and ever-growing bucket list, Stephen offers a unique perspective on living a life rich with experience over material possessions. Finally, he offers invaluable advice for the next generation on navigating careers in an AI-driven world and living life to the fullest.

Here are 5 exciting and unique data analyst projects that will build your skills and impress hiring managers! These range from beginner to advanced and are designed to enhance your data storytelling abilities. ✨ Try Julius today at https://landadatajob.com/Julius-YT Where I Go To Find Datasets (as a data analyst) 👉 https://youtu.be/DHfuvMyBofE?si=ABsdUfzgG7Nsbl89 💌 Join 10k+ aspiring data analysts & get my tips in your inbox weekly 👉 https://www.datacareerjumpstart.com/newsletter 🆘 Feeling stuck in your data journey? Come to my next free "How to Land Your First Data Job" training 👉 https://www.datacareerjumpstart.com/training 👩‍💻 Want to land a data job in less than 90 days? 👉 https://www.datacareerjumpstart.com/daa 👔 Ace The Interview with Confidence 👉 https://www.datacareerjumpstart.com/interviewsimulator

⌚ TIMESTAMPS 00:00 - Introduction 00:24 - Project 1: Stock Price Analysis 03:46 - Project 2: Real Estate Data Analysis (SQL) 07:52 - Project 3: Personal Finance Dashboard (Tableau or Power BI) 11:20 - Project 4: Pokemon Analysis (Python) 14:16 - Project 5: Football Data Analysis (any tool)

🔗 CONNECT WITH AVERY 🎥 YouTube Channel: https://www.youtube.com/@averysmith 🤝 LinkedIn: https://www.linkedin.com/in/averyjsmith/ 📸 Instagram: https://instagram.com/datacareerjumpstart 🎵 TikTok: https://www.tiktok.com/@verydata 💻 Website: https://www.datacareerjumpstart.com/ Mentioned in this episode: Join the last cohort of 2025! The LAST cohort of The Data Analytics Accelerator for 2025 kicks off on Monday, December 8th and enrollment is officially open!

To celebrate the end of the year, we’re running a special End-of-Year Sale, where you’ll get: ✅ A discount on your enrollment 🎁 6 bonus gifts, including job listings, interview prep, AI tools + more

If your goal is to land a data job in 2026, this is your chance to get ahead of the competition and start strong.

👉 Join the December Cohort & Claim Your Bonuses: https://www.datacareerjumpstart.com/daa

Struggling with data trust issues, dashboard drama, or constant pipeline firefighting? In this deep-dive interview, Lior Barak shows you how to shift from a reactive "fix-it" culture to a mindful, impact-driven practice rooted in Zen/Wabi-Sabi principles. You'll learn:
Why 97% of CEOs say they use data, but only 24% call themselves data-driven
The traffic-light dashboard pattern (green / yellow / red) that instantly tells execs whether numbers are safe to use
A practical rule for balancing maintenance, rollout, and innovation, and avoiding team burnout
How to quantify ROI on data products, kill failing legacy systems, and handle ad-hoc exec requests without derailing roadmaps
Turning "imperfect" data into business value with mindful communication, root-cause logs, and automated incident review loops
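The traffic-light pattern described above maps naturally onto a tiny decision function. A minimal sketch: the check names and the rule that only critical failures turn a dashboard red are illustrative assumptions, not Lior's exact criteria.

```python
def traffic_light(checks: dict[str, bool]) -> str:
    """Map data-quality check results to a green/yellow/red status.

    green:  every check passed -> numbers are safe to use
    yellow: only non-critical checks failed -> use with caution
    red:    a critical check failed -> do not trust the numbers
    """
    critical = {"freshness", "schema"}  # hypothetical critical checks
    failed = {name for name, ok in checks.items() if not ok}
    if not failed:
        return "green"
    if failed & critical:
        return "red"
    return "yellow"

print(traffic_light({"freshness": True, "schema": True, "volume": True}))   # green
print(traffic_light({"freshness": True, "schema": True, "volume": False}))  # yellow
print(traffic_light({"freshness": False, "schema": True, "volume": True}))  # red
```

The value of the pattern is that executives never see the raw checks, only a single status they can act on.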

🕒 TIMECODES 00:00 Community and mindful data strategy 04:06 Career journey and product management insights 08:03 Wabi-sabi data and the trust crisis 11:47 AI, data imperfection, and trust challenges 20:05 Trust crisis examples and root cause analysis 25:06 Regaining trust through mindful data management 30:47 Traffic light system and effective communication 37:41 Communication gaps and team workload balance 39:58 Maintenance stress and embracing Zen mindset 49:29 Accepting imperfection and measuring impact 56:19 Legacy systems and managing executive requests 01:00:23 Role guidance and closing reflections

🔗 Connect with Lior LinkedIn - https://www.linkedin.com/in/liorbarak Cooking Data newsletter - https://cookingdata.substack.com/ Data Product Lifecycle Manager - https://app--data-product-lifecycle-manager-c81b10bb.base44.app/

🔗 Connect with DataTalks.Club Join the community - https://datatalks.club/slack.html Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/u/0/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ Check other upcoming events - https://lu.ma/dtc-events GitHub: https://github.com/DataTalksClub LinkedIn - https://www.linkedin.com/company/datatalks-club/ Twitter - https://x.com/DataTalksClub Website - https://datatalks.club/

🔗 Connect with Alexey Twitter - https://x.com/Al_Grigor Linkedin - https://www.linkedin.com/in/agrigorev/

--- How can data storytelling improve health outcomes and save lives? The Lung Health Dashboard offers one example.

--- Today’s episode explores how effective data storytelling connects the public with life-saving research, using the Lung Health Dashboard as an example. This project is a collaboration between GovEx and the Johns Hopkins BREATHE Center, which promotes the science and medicine of lung health by interfacing with the community. The dashboard seeks to overcome the challenges of data communication through dynamic, “scrollytelling” visuals that present research findings to viewers from a relatable perspective.

--- We’re joined by Meredith McCormack, Director of the Pulmonary & Critical Care Medicine Division of Johns Hopkins Medicine and Director of the BREATHE Center; Kirsten Koehler, a professor in the Department of Environmental Health and Engineering at the Bloomberg School of Public Health and Deputy Director of the BREATHE Center; and Mary Conway Vaughan, Deputy Director of Research and Analytics here at GovEx.

--- Learn more about the BREATHE Center --- View the Lung Health Dashboard --- Learn more about GovEx --- Fill out our listener survey

Todd Olson joins me to talk about making analytics worth paying for and relevant in the age of AI. As the CEO of Pendo, an analytics SaaS company, Todd shares how the company evolved to support a wider audience by simplifying dashboards, removing user roadblocks, and leveraging AI to both generate and explain insights. We also talked about the role of product management at Pendo. Todd views AI product management as a natural evolution for adaptable teams and explains how he thinks about hiring product roles in 2025. Todd also shares how he measures successful user adoption of his product through "time to value" and "stickiness" rather than vanity metrics like time spent.

Highlights/ Skip to:

How Todd has addressed analytics apathy over the past decade at Pendo (1:17)
Getting back to basics and not barraging people with more data and power (4:02)
Pendo’s strategy for keeping the product experience simple without abandoning power users (6:44)
Whether Todd is considering using an LLM (prompt-based) answer-driven experience with Pendo's UI (8:51)
What Pendo looks for when hiring product managers right now, and why (14:58)
How Pendo evaluates AI product managers, specifically (19:14)
How Todd Olson views AI product management compared to traditional software product management (21:56)
Todd’s concerns about the probabilistic nature of AI-generated answers in the product UX (27:51)
What KPIs Todd uses to know whether Pendo is doing enough to reach its goals (32:49)
Why being able to tell what answers are best will become more important as choice increases (40:05)

Quotes from Today’s Episode

“Let’s go back to classic Geoffrey Moore Crossing the Chasm: you’re selling to early adopters. And what you’re doing is you’re relying on the early adopters’ skill set and figuring out how to take this data and connect it to business problems. So, in the early days, we didn’t do anything because the market we were selling to was very, very savvy; they’re hungry people, they just like new things. They’re getting data, they’re feeling really, really smart, everything’s working great. As you get bigger and bigger and bigger, you start to try to sell to a bigger TAM, a bigger audience, you start trying to talk to these early majorities, which are, they’re not early adopters, they’re more technology laggards in some degree, and they don’t understand how to use data to inform their job. They’ve never used data to inform their job. There, we’ve had to do a lot more work.” Todd (2:04 - 2:58)

“I think AI is amazing, and I don’t want to say AI is overhyped because AI in general is—yeah, it’s the revolution that we all have to pay attention to. Do I think that the skills necessary to be an AI product manager are so distinct that you need to hire differently? No, I don’t. That’s not what I’m seeing. If you have a really curious product manager who’s going all in, I think you’re going to be okay. Some of the most AI-forward work happening at Pendo is not just product management. Our design team is going crazy. And I think one of the things that we’re seeing is a blend between design and product, that they’re always adjacent and connected; there’s more sort of overlappiness now.” Todd (22:41 - 23:28)

“I think about things like stickiness, which may not be an aggregate time, but how often are people coming back and checking in? And if you had this companion or this agent that you just could not live without, and it caused you to come into the product almost every day just to check in, but it’s a fast check-in, like, a five-minute check-in, a ten-minute check-in, that’s pretty darn sticky. That’s a good metric. So, I like stickiness as a metric because it’s measuring [things like], “Are you thinking about this product a lot?” And if you’re thinking about it a lot, and like, you can’t kind of live without it, you’re going to go to it a lot, even if it’s only a few minutes a day. Social media is like that. Thankfully I’m not addicted to TikTok or Instagram or anything like that, but I probably check it nearly every day. That’s a pretty good metric. Any product that you’re checking every day is pretty darn good. So yeah, but I think we need to reframe the conversation, not just total time. Like, how are we measuring outcomes and value? I think that’s what’s ultimately going to win here.” Todd (39:57)
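Todd's stickiness idea, frequent short check-ins rather than total time spent, can be made concrete with a simple return-rate metric. This is a sketch under our own assumptions (a days-active ratio over a fixed window), not Pendo's actual formula.

```python
from datetime import date

def stickiness(visit_dates: list[date], window_days: int = 30) -> float:
    """Fraction of days in the window with at least one check-in.

    Measures "are you coming back often?" rather than time spent:
    a user who opens the product for five minutes a day scores high.
    """
    distinct_days = len(set(visit_dates))
    return distinct_days / window_days

# A user who checked in on 27 distinct days out of 30 is very sticky,
# even if each visit was only a few minutes long.
daily_user = [date(2025, 6, d) for d in range(1, 28)]
print(round(stickiness(daily_user), 2))  # 0.9
```

A DAU/MAU-style ratio like this rewards habit formation, which is exactly the behavior Todd contrasts with aggregate time-in-product.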

Links

LinkedIn: https://www.linkedin.com/in/toddaolson/  X: https://x.com/tolson  [email protected] 

podcast_episode
by Rose Weeks (Johns Hopkins Bloomberg School of Public Health), Heather Bree (GovEx), Debi Denney (Johns Hopkins Office of Climate & Sustainability), Sara Betran de Lis (GovEx)

--- According to the U.S. Environmental Protection Agency, transportation accounts for 28% of U.S. greenhouse gas emissions. For short trips, flying is much more carbon-intensive than rail or bus travel. At Johns Hopkins, faculty members travel the most of all affiliate types, producing more than double the emissions of administrative employees and staff.

--- The Johns Hopkins University Office of Climate and Sustainability, through its Campus as a Living Lab initiative - a program that supports sustainability innovation - partnered with GovEx to build a tool to help address this problem. Using interactive visualizations with comparable statistics across all Johns Hopkins divisions, users can compare the emissions data of different methods of transportation, enabling them to make more environmentally-friendly choices as they conduct their business.
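At its core, the comparison the tool enables is distance multiplied by a per-mode emission factor. A minimal sketch: the factors below are illustrative placeholders, not the figures used by the Johns Hopkins dashboard.

```python
# Per-passenger emission factors in kg CO2e per passenger-km.
# These numbers are ILLUSTRATIVE assumptions for the sketch only.
FACTORS = {
    "flight_short_haul": 0.25,
    "car_solo": 0.17,
    "rail": 0.04,
    "bus": 0.03,
}

def trip_emissions(distance_km: float, mode: str) -> float:
    """Estimated per-passenger CO2e for one trip."""
    return distance_km * FACTORS[mode]

# Compare modes for a 400 km business trip.
for mode in FACTORS:
    print(f"{mode:>17}: {trip_emissions(400, mode):6.1f} kg CO2e")
```

With comparable per-mode statistics like these, a traveler can see at a glance how much a short-haul flight costs relative to rail or bus for the same trip.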

--- We sit down with four contributors to the project to discuss how the tool was built and how cities can use it as a model to support their own climate change initiatives: Sara Betran de Lis, Director of Research and Analytics at GovEx; Heather Bree, Data Visualization and D3 Developer at GovEx; Debi Denney, Assistant Director of Johns Hopkins Office of Climate & Sustainability; and Rose Weeks, Senior Research Associate at Johns Hopkins Bloomberg School of Public Health, working with the Campus as a Living Lab Program at the Office of Climate & Sustainability.

--- Learn more about GovEx --- Fill out our listener survey!

In today’s fast-paced business world, timely and reliable insights are crucial — but manual BI workflows can’t keep up. This session offers a practical guide to automating business intelligence processes using Apache Airflow. We’ll walk through real-world examples of automating data extraction, transformation, dashboard refreshes, and report distribution. Learn how to design DAGs that align with business SLAs, trigger workflows based on events, integrate with popular BI tools like Tableau and Power BI, and implement alerting and failure recovery mechanisms. Whether you’re new to Airflow or looking to scale your BI operations, this session will equip you with actionable strategies to save time, reduce errors, and supercharge your organization’s decision-making capabilities.
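The SLA-alignment and alerting ideas in this session reduce to a freshness check that, in Airflow, would live in a sensor or failure callback. The status names and thresholds in this plain-Python sketch are our own assumptions, not part of Airflow's API.

```python
from datetime import datetime, timedelta

def sla_status(last_refresh: datetime, now: datetime,
               sla: timedelta, grace: timedelta) -> str:
    """Decide whether a dashboard refresh meets its business SLA.

    "ok"      -> refreshed within the SLA
    "warning" -> late, but still inside the grace period
    "breach"  -> page someone and trigger failure recovery (e.g. a re-run)
    """
    age = now - last_refresh
    if age <= sla:
        return "ok"
    if age <= sla + grace:
        return "warning"
    return "breach"

now = datetime(2025, 6, 9, 9, 0)
print(sla_status(datetime(2025, 6, 9, 8, 30), now,
                 timedelta(hours=1), timedelta(minutes=30)))  # ok
```

An hourly DAG would run this check as its final task and route "warning" and "breach" results to the alerting mechanism of choice.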

Summary In this episode of the Data Engineering Podcast we welcome back Nick Schrock, CTO and founder of Dagster Labs, to discuss the evolving landscape of data engineering in the age of AI. As AI begins to impact data platforms and the role of data engineers, Nick shares his insights on how it will ultimately enhance productivity and expand software engineering's scope. He delves into the current state of AI adoption, the importance of maintaining core data engineering principles, and the need for human oversight when leveraging AI tools effectively. Nick also introduces Dagster's new components feature, designed to modularize and standardize data transformation processes, making it easier for teams to collaborate and integrate AI into their workflows. Join in to explore the future of data engineering, the potential for AI to abstract away complexity, and the importance of open standards in preventing walled gardens in the tech industry.

Announcements
Hello and welcome to the Data Engineering Podcast, the show about modern data management.
This episode is brought to you by Coresignal, your go-to source for high-quality public web data to power best-in-class AI products. Instead of spending time collecting, cleaning, and enriching data in-house, use ready-made multi-source B2B data that can be smoothly integrated into your systems via APIs or as datasets. With over 3 billion data records from 15+ online sources, Coresignal delivers high-quality data on companies, employees, and jobs. It is powering decision-making for more than 700 companies across AI, investment, HR tech, sales tech, and market intelligence industries. A founding member of the Ethical Web Data Collection Initiative, Coresignal stands out not only for its data quality but also for its commitment to responsible data collection practices. Recognized as the top data provider by Datarade for two consecutive years, Coresignal is the go-to partner for those who need fresh, accurate, and ethically sourced B2B data at scale. Discover how Coresignal's data can enhance your AI platforms. Visit dataengineeringpodcast.com/coresignal to start your free 14-day trial.
Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
This is a pharmaceutical Ad for Soda Data Quality. Do you suffer from chronic dashboard distrust? Are broken pipelines and silent schema changes wreaking havoc on your analytics? You may be experiencing symptoms of Undiagnosed Data Quality Syndrome — also known as UDQS. Ask your data team about Soda. With Soda Metrics Observability, you can track the health of your KPIs and metrics across the business — automatically detecting anomalies before your CEO does. It’s 70% more accurate than industry benchmarks, and the fastest in the category, analyzing 1.1 billion rows in just 64 seconds. And with Collaborative Data Contracts, engineers and business can finally agree on what “done” looks like — so you can stop fighting over column names, and start trusting your data again. Whether you’re a data engineer, analytics lead, or just someone who cries when a dashboard flatlines, Soda may be right for you. Side effects of implementing Soda may include: increased trust in your metrics, reduced late-night Slack emergencies, spontaneous high-fives across departments, fewer meetings and less back-and-forth with business stakeholders, and in rare cases, a newfound love of data. Sign up today to get a chance to win a $1000+ custom mechanical keyboard. Visit dataengineeringpodcast.com/soda to sign up and follow Soda’s launch week. It starts June 9th.
Your host is Tobias Macey and today I'm interviewing Nick Schrock about lowering the barrier to entry for data platform consumers.
Interview
Introduction
How did you get involved in the area of data management?
Can you start by giving your summary of the impact that the tidal wave of AI has had on data platforms and data teams?
For anyone who hasn't heard of Dagster, can you give a quick summary of the project?
What are the notable changes in the Dagster project in the past year?
What are the ecosystem pressures that have shaped the ways that you think about the features and trajectory of Dagster as a project/product/community?
In your recent release you introduced "components", which is a substantial change in how you enable teams to collaborate on data problems.
What was the motivating factor in that work and how does it change the ways that organizations engage with their data?
tension between being flexible and extensible vs. opinionated and constrained
increased dependency on orchestration with LLM use cases
reducing the barrier to contribution for data platform/pipelines
bringing application engineers into the mix
challenges of meeting users/teams where they are (languages, platform investments, etc.)
What are the most interesting, innovative, or unexpected ways that you have seen teams applying the Components pattern?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on the latest iterations of Dagster?
When is Dagster the wrong choice?
What do you have planned for the future of Dagster?
Contact Info: LinkedIn
Parting Question: From your perspective, what is the biggest gap in the tooling or technology for data management today?
Links: Dagster+ Episode · Dagster Components Slide Deck · The Rise Of Medium Code · Lakehouse Architecture · Iceberg · Dagster Components · Pydantic Models · Kubernetes · Dagster Pipes · Ruby on Rails · dbt · Sling · Fivetran · Temporal · MCP == Model Context Protocol
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

D&A value is not possible without data storytelling, which offers a better way to communicate findings than BI reporting or data science notebooks alone. Join this session to learn the fundamentals of data storytelling and how to bridge the gap between data science practitioners and decision makers. The session also discusses how to craft the best data stories and how to scale data storytelling for the future in the landscape of GenAI.

Measure What Matters: Quality-Focused Monitoring for Production AI Agents

Ensuring the operational excellence of AI agents in production requires robust monitoring capabilities that span both performance metrics and quality evaluation. This session explores Databricks' comprehensive Mosaic Agent Monitoring solution, designed to provide visibility into deployed AI agents through an intuitive dashboard that tracks critical operational metrics and quality indicators. We'll demonstrate how to use the Agent Monitoring solution to iteratively improve a production agent that delivers a better customer support experience while decreasing the cost of delivering customer support. We will show how to:
Identify and proactively fix a quality problem with the GenAI agent’s responses before it becomes a major issue
Understand users’ usage patterns and implement/test a feature improvement to the GenAI agent
Key session takeaways include:
Techniques for monitoring essential operational metrics, including request volume, latency, errors, and cost efficiency across your AI agent deployments
Strategies for implementing continuous quality evaluation using AI judges that assess correctness, guideline adherence, and safety without requiring ground truth labels
Best practices for setting up effective monitoring dashboards that enable dimension-based analysis across time periods, user feedback, and topic categories
Methods for collecting and integrating end-user feedback to create a closed-loop system that drives iterative improvement of your AI agents
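The operational metrics named above (request volume, latency, errors, cost) can all be derived from raw request logs. This generic aggregation sketch is not the Databricks monitoring API, just an illustration of the computation such a dashboard performs; the log field names are our own assumptions.

```python
import math

def summarize(requests: list[dict]) -> dict:
    """Aggregate agent request logs into operational metrics."""
    n = len(requests)
    latencies = sorted(r["latency_ms"] for r in requests)
    # Nearest-rank p95; clamp the index for small samples.
    p95_index = min(n - 1, math.ceil(0.95 * n) - 1)
    return {
        "volume": n,
        "error_rate": sum(r["error"] for r in requests) / n,
        "p95_latency_ms": latencies[p95_index],
        "total_cost_usd": sum(r["cost_usd"] for r in requests),
    }

logs = [
    {"latency_ms": 120, "error": False, "cost_usd": 0.002},
    {"latency_ms": 340, "error": False, "cost_usd": 0.004},
    {"latency_ms": 900, "error": True,  "cost_usd": 0.001},
]
print(summarize(logs))
```

Slicing the same logs by time period, user feedback, or topic before aggregating gives the dimension-based analysis the session describes.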

Sponsored by: Datafold | Breaking Free: How Evri is Modernizing SAP HANA Workflows to Databricks with AI and Datafold

With expensive contracts up for renewal, Evri faced the challenge of migrating 1,000 SAP HANA assets and 200+ Talend jobs to Databricks. This talk will cover how we transformed SAP HANA and Talend workflows into modern Databricks pipelines through AI-powered translation and validation, without months of manual coding. We'll cover:
Techniques for handling SAP HANA's proprietary formats
Approaches for refactoring incremental pipelines while ensuring dashboard stability
The technology enabling automated translation of complex business logic
Validation strategies that guarantee migration accuracy
We'll share real examples of SAP HANA stored procedures transformed into Databricks code and demonstrate how we maintained 100% uptime of critical dashboards during the transition. Join us to discover how AI is revolutionizing what's possible in enterprise migrations from GUI-based legacy systems to modern, code-first data platforms.
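One common validation strategy for such migrations is a key-level diff between source and target tables. This plain-Python sketch illustrates the idea on in-memory rows; a real validation would run against the warehouses themselves, and the field names here are hypothetical.

```python
def validate_migration(source_rows: list[dict], target_rows: list[dict],
                       key: str) -> dict:
    """Cross-check migrated rows against the source by primary key."""
    src = {r[key]: r for r in source_rows}
    tgt = {r[key]: r for r in target_rows}
    missing = sorted(set(src) - set(tgt))        # in source, absent in target
    extra = sorted(set(tgt) - set(src))          # in target, absent in source
    mismatched = sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k])
    return {"missing": missing, "extra": extra, "mismatched": mismatched,
            "ok": not (missing or extra or mismatched)}

src = [{"id": 1, "total": 10.0}, {"id": 2, "total": 20.0}]
tgt = [{"id": 1, "total": 10.0}, {"id": 2, "total": 21.0}]
print(validate_migration(src, tgt, "id"))
```

Running a diff like this after every translated pipeline is what lets a team claim migration accuracy rather than assume it.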

Sponsored by: DataNimbus | Building an AI Platform in 30 Days and Shaping the Future with Databricks

Join us as we dive into how Turnpoint Services, in collaboration with DataNimbus, built an Intelligence Platform on Databricks in just 30 days. We'll explore features like MLflow, LLMs, MLOps, Model Registry, Unity Catalog & Dashboard Alerts that powered AI applications such as Demand Forecasting, Customer 360 & Review Automation. Turnpoint’s transformation enabled data-driven decisions, ops efficiency & a better customer experience. Building a modern data foundation on Databricks optimizes resource allocation & drives engagement. We’ll also introduce innovations in DataNimbus Designer: AI Blocks: modular, prompt-driven smart transformers for text data, built visually & deployed directly within Databricks. These capabilities push the boundaries of what's possible on the Databricks platform. Attendees will gain practical insights, whether you're beginning your AI journey or looking to accelerate it.

Streamline Your BI Infrastructure With Databricks AI/BI and Save Millions on Traditional BI Tools

Earlier this year, we finished the migration of all dashboards from a traditional BI system to the Databricks AI/BI ecosystem, resulting in annual savings of approximately $900,000. We also unlocked the advantages below:
Data security, integrity and safety
Cost savings
Single source of truth
Real-time data
Genie space
We will speak about our journey and how you can migrate your dashboards from traditional BI to AI/BI. Having listed the advantages above, we will also speak of some challenges faced. Migration steps:
Analytical scoping of the dashboard inventory
Feature mapping: from traditional BI to AI/BI
Building bronze, silver and gold tables
Building dashboards
Migration shenanigans: hypercare phase, change management, KT documents, demo sessions, deprecation of licenses and dashboards on traditional BI tools
We look forward to sharing these lessons learned and insights with you to help you streamline your BI infrastructure and unlock the full potential of Databricks AI/BI.
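The bronze/silver/gold step in the migration plan can be illustrated with a toy in-memory pipeline. In practice these layers would be Delta tables populated by Spark jobs; the field names and cleaning rules here are made-up examples of the layering idea.

```python
# Bronze: raw records as landed; Silver: deduplicated and typed;
# Gold: business-level aggregate that a dashboard reads.
bronze = [
    {"order_id": "1", "amount": "19.99", "region": "EU "},
    {"order_id": "2", "amount": "5.00",  "region": "US"},
    {"order_id": "2", "amount": "5.00",  "region": "US"},   # duplicate landing
]

def to_silver(rows: list[dict]) -> list[dict]:
    """Deduplicate by order_id, cast types, trim strings."""
    seen, out = set(), []
    for r in rows:
        oid = r["order_id"]
        if oid in seen:
            continue
        seen.add(oid)
        out.append({"order_id": oid, "amount": float(r["amount"]),
                    "region": r["region"].strip()})
    return out

def to_gold(rows: list[dict]) -> dict:
    """Revenue by region -- the shape a dashboard tile consumes."""
    totals: dict[str, float] = {}
    for r in rows:
        totals[r["region"]] = totals.get(r["region"], 0.0) + r["amount"]
    return totals

print(to_gold(to_silver(bronze)))  # {'EU': 19.99, 'US': 5.0}
```

Keeping dashboards pointed only at gold tables is what gives the "single source of truth" advantage listed above.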

Summary In this episode of the Data Engineering Podcast Alex Albu, tech lead for AI initiatives at Starburst, talks about integrating AI workloads with the lakehouse architecture. From his software engineering roots to leading data engineering efforts, Alex shares insights on enhancing Starburst's platform to support AI applications, including an AI agent for data exploration and using AI for metadata enrichment and workload optimization. He discusses the challenges of integrating AI with data systems, innovations like SQL functions for AI tasks and vector databases, and the limitations of traditional architectures in handling AI workloads. Alex also shares his vision for the future of Starburst, including support for new data formats and AI-driven data exploration tools.

Announcements
Hello and welcome to the Data Engineering Podcast, the show about modern data management.
Your host is Tobias Macey and today I'm interviewing Alex Albu about how Starburst is extending the lakehouse to support AI workloads.
Interview
Introduction
How did you get involved in the area of data management?
Can you start by outlining the interaction points of AI with the types of data workflows that you are supporting with Starburst?
What are some of the limitations of warehouse and lakehouse systems when it comes to supporting AI systems?
What are the points of friction for engineers who are trying to employ LLMs in the work of maintaining a lakehouse environment?
Methods such as tool use (exemplified by MCP) are a means of bolting on AI models to systems like Trino.
What are some of the ways that is insufficient or cumbersome?
Can you describe the technical implementation of the AI-oriented features that you have incorporated into the Starburst platform?
What are the foundational architectural modifications that you had to make to enable those capabilities?
For the vector storage and indexing, what modifications did you have to make to Iceberg?
What was your reasoning for not using a format like Lance?
For teams who are using Starburst and your new AI features, what are some examples of the workflows that they can expect?
What new capabilities are enabled by virtue of embedding AI features into the interface to the lakehouse?
What are the most interesting, innovative, or unexpected ways that you have seen Starburst AI features used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on AI features for Starburst?
When is Starburst/lakehouse the wrong choice for a given AI use case?
What do you have planned for the future of AI on Starburst?
Contact Info: LinkedIn
Parting Question: From your perspective, what is the biggest gap in the tooling or technology for data management today?
Closing Announcements
Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
Links: Starburst · Podcast Episode · AWS Athena · MCP == Model Context Protocol · LLM Tool Use · Vector Embeddings · RAG == Retrieval Augmented Generation · AI Engineering Podcast Episode · Starburst Data Products · Lance · LanceDB · Parquet · ORC · pgvector · Starburst Icehouse
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Unified Advanced Analytics: Integrating Power BI and Databricks Genie for Real-time Insights

In today’s data-driven landscape, business users expect seamless, interactive analytics without having to switch between different environments. This presentation explores our web application that unifies a Power BI dashboard with Databricks Genie, allowing users to query and visualize insights from the same dataset within a single, cohesive interface. We will compare two integration strategies: one that leverages a traditional webpage enhanced by an Azure bot to incorporate Genie’s capabilities, and another that utilizes Databricks Apps to deliver a smoother, native experience. We use the Genie API to build this solution. Attendees will learn the architecture behind these solutions, key design considerations and challenges encountered during implementation. Join us to see live demos of both approaches, and discover best practices for delivering an all-in-one, interactive analytics experience.

FinOps: Automated Unity Catalog Cost Observability, Data Isolation and Governance Framework

Westat, a leader in data-driven research for more than 60 years, has implemented a centralized Databricks platform to support hundreds of research projects for government, foundation, and private clients. This initiative modernizes Westat’s technical infrastructure while maintaining rigorous statistical standards and streamlining data science. The platform enables isolated project environments with strict data boundaries, centralized oversight, and regulatory compliance. It allows project-specific customization of compute and analytics, and delivers scalable computing for complex analyses. Key features include config-driven Infrastructure as Code (IaC) with Terragrunt, custom tagging and AWS cost integration for ROI tracking, budget policies with alerts for proactive cost management, and a centralized dashboard with row-level security for self-service cost analytics. This unified approach provides full financial visibility and governance while empowering data teams to deliver value.
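The tagging-plus-cost-integration pattern described above ultimately reduces to joining usage records with their tags and aggregating spend per project, then comparing totals against budgets to drive alerts. The record fields, tag names, and budget figures below are hypothetical, a minimal sketch of the idea rather than Westat's implementation:

```python
from collections import defaultdict

def cost_by_tag(usage_records, tag_key="project"):
    """Aggregate spend per tag value; untagged usage is grouped
    separately so it can be chased down."""
    totals = defaultdict(float)
    for rec in usage_records:
        key = rec.get("tags", {}).get(tag_key, "UNTAGGED")
        totals[key] += rec["cost_usd"]
    return dict(totals)

def over_budget(totals, budgets):
    # Projects whose spend exceeds their budget: input for alerting.
    return {p: spend for p, spend in totals.items()
            if spend > budgets.get(p, float("inf"))}

usage = [
    {"cost_usd": 120.0, "tags": {"project": "survey-a"}},
    {"cost_usd": 80.0,  "tags": {"project": "survey-a"}},
    {"cost_usd": 40.0,  "tags": {"project": "survey-b"}},
    {"cost_usd": 15.0,  "tags": {}},
]
totals = cost_by_tag(usage)
print(totals)                                  # per-project spend
print(over_budget(totals, {"survey-a": 150}))  # {'survey-a': 200.0}
```

In practice the usage records would come from cloud billing exports and platform system tables, and the budget check would feed a notification pipeline rather than a print statement.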

Real-Time Market Insights — Powering Optiver’s Live Trading Dashboard with Databricks Apps and Dash

In the fast-paced world of trading, real-time insights are critical for making informed decisions. This presentation explores how Optiver, a leading high-frequency trading firm, harnesses Databricks Apps to power its live trading dashboards. The technology enables traders to analyze market data, detect patterns, and respond instantly. In this talk, we will showcase how our system leverages Databricks’ scalable infrastructure, such as Structured Streaming, to efficiently handle vast streams of financial data while ensuring low-latency performance. In addition, we will show how the integration of Databricks Apps with Dash has empowered traders to rapidly develop and deploy custom dashboards, minimizing dependency on developers. Attendees will gain insights into our architecture, data processing techniques, and lessons learned in integrating Databricks Apps with Dash in order to drive rapid, data-driven trading decisions.
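The streaming pattern the abstract describes, continuous market ticks rolled up into low-latency aggregates that feed a dashboard, can be illustrated in miniature with a sliding time window. In production this would be a Structured Streaming job; the toy class below (pure stdlib, hypothetical tick shape) only shows the windowed-aggregate idea:

```python
from collections import deque

class RollingVWAP:
    """Volume-weighted average price over a sliding time window: a toy
    stand-in for the windowed aggregates a streaming job maintains."""
    def __init__(self, window_s=60.0):
        self.window_s = window_s
        self.ticks = deque()  # (timestamp, price, volume)

    def add(self, ts, price, volume):
        self.ticks.append((ts, price, volume))
        # Evict ticks that have fallen out of the window.
        while self.ticks and self.ticks[0][0] < ts - self.window_s:
            self.ticks.popleft()

    def vwap(self):
        vol = sum(v for _, _, v in self.ticks)
        if vol == 0:
            return None
        return sum(p * v for _, p, v in self.ticks) / vol

w = RollingVWAP(window_s=60)
w.add(0.0, 100.0, 10)
w.add(30.0, 102.0, 30)
w.add(90.0, 101.0, 10)   # the tick at t=0 falls out of the window
print(w.vwap())          # 101.75
```

A dashboard layer (e.g. a Dash callback on an interval) would then poll such aggregates and re-render, which is why keeping the aggregation incremental matters for latency.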

Unity Catalog Upgrades Made Easy: A Step-by-Step Guide for Databricks Labs UCX

The Databricks Labs project UCX aims to optimize the Unity Catalog (UC) upgrade process, ensuring a seamless transition for businesses. This session will delve into various aspects of the UCX project, including the installation and configuration of UCX, the use of the UCX Assessment Dashboard to reduce upgrade risks and prepare effectively for a UC upgrade, and the automation of key components such as group, table, and code migration. Attendees will gain comprehensive insights into leveraging UCX and Lakehouse Federation for a streamlined and efficient upgrade process. This session is aimed at customers new to UCX as well as veterans.

Improving User Experience and Efficiency Using DBSQL

To scale Databricks SQL to 2,000 users efficiently and cost-effectively, we adopted serverless, ensuring dynamic scalability and resource optimization. During peak times, resources scale up automatically; during low demand, they scale down, preventing waste. Additionally, we implemented a strong content governance model. We created continuous monitoring to assess query and dashboard performance, notifying users about adjustments and ensuring only relevant content remains active. If a query exceeds time or impact limits, access is reviewed and, if necessary, deactivated. This approach brought greater efficiency, cost reduction and an improved user experience, keeping the platform well-organized and high-performing.
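The governance loop described above, flag queries that exceed time or impact limits so their access can be reviewed, boils down to a policy check over query telemetry. The thresholds, record fields, and data below are hypothetical, a sketch of the shape of such a check rather than the team's actual monitor:

```python
def flag_queries(query_stats, max_runtime_s=300, max_scanned_gb=50):
    """Return queries violating governance limits, with reasons,
    so owners can be notified before content is deactivated."""
    flagged = []
    for q in query_stats:
        reasons = []
        if q["runtime_s"] > max_runtime_s:
            reasons.append("runtime limit exceeded")
        if q["scanned_gb"] > max_scanned_gb:
            reasons.append("scan volume limit exceeded")
        if reasons:
            flagged.append({"id": q["id"], "owner": q["owner"],
                            "reasons": reasons})
    return flagged

stats = [
    {"id": "q1", "owner": "ana",  "runtime_s": 420, "scanned_gb": 12},
    {"id": "q2", "owner": "ben",  "runtime_s": 45,  "scanned_gb": 80},
    {"id": "q3", "owner": "caro", "runtime_s": 30,  "scanned_gb": 5},
]
print(flag_queries(stats))  # q1 and q2 are flagged; q3 is clean
```

In a real deployment the telemetry would come from the warehouse's query history, and flagged results would feed notifications and an access-review queue.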

Sponsored by: Astronomer | Scaling Data Teams for the Future

The role of data teams and data engineers is evolving. No longer just pipeline builders or dashboard creators, today’s data teams must evolve to drive business strategy, enable automation, and scale with growing demands. Best practices from the software engineering world (Agile development, CI/CD, and Infrastructure as Code), born of the DevOps movement, are gradually making their way into data engineering. We believe these changes have led to the rise of DataOps and a new wave of best practices that will transform the discipline of data engineering. But how do you transform a reactive team into a proactive force for innovation? We’ll explore the key principles for building a resilient, high-impact data team, from structuring for collaboration to testing, automation, and leveraging modern orchestration tools. Whether you’re leading a team or looking to future-proof your career, you’ll walk away with actionable insights on how to stay ahead in the rapidly changing data landscape.