talk-data.com

Juan Sequeda

Speaker · 10 talks

Principal Scientist and Head of AI Lab, data.world

Juan Sequeda is a Principal Fundamental Researcher at ServiceNow, joining through the acquisition of data.world. He holds a PhD in Computer Science from The University of Texas at Austin. Juan’s research and industry work have been at the intersection of data and AI, with the goal of reliably creating knowledge from inscrutable data, specifically by designing and building knowledge graphs for enterprise data and metadata management. Juan is the co-author of the book “Designing and Building Enterprise Knowledge Graphs” and the co-host of Catalog & Cocktails, an honest, no-BS, non-salesy data podcast.

Bio from: dbt Coalesce 2024


Talks & appearances

10 activities · Newest first

Face To Face
with Tim Gasper (data.world from ServiceNow), Juan Sequeda (data.world)

In the era of AI agents, data products, and exploding complexity, enterprises face an unprecedented challenge: how to give machines and people the context needed to make decisions that are accurate, explainable, and trustworthy.

What if data teams could solve this? 

We'll explore a new opportunity: data teams as the architects and stewards of the “Enterprise Brain”, a unified knowledge layer that connects business concepts, data, and processes across the organization. By treating context as a first-class citizen, organizations can bridge structured and unstructured worlds, govern data and AI agents effectively, and unlock faster, more confident decision-making. 

We'll unpack: 

• Why context and knowledge are becoming the missing pieces of AI readiness. 

• How structured and unstructured data are converging, and why this demands a new governance model. 

• Why ontologies, knowledge graphs, and active governance are critical to managing the coming explosion of AI agents. 

• The unique chance data teams have right now to lead and move from building pipelines to building the enterprise brain. 

We won’t talk tools. We’ll talk possibility. By the end, you’ll see why the most successful organizations of the next decade will be the ones who make their data teams the creators and owners of knowledge itself. 
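The abstract above is deliberately conceptual, but a small sketch can make the idea of a unified knowledge layer concrete. The following minimal Python example uses rdflib to put business concepts, data, and process context into one queryable graph. Every name in it (the ex: namespace, Customer, Policy, sourceSystem) is invented for illustration and is not from the talk.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

EX = Namespace("http://example.com/enterprise#")
g = Graph()
g.bind("ex", EX)

# Business concepts: a tiny ontology of what the business talks about.
g.add((EX.Customer, RDF.type, RDFS.Class))
g.add((EX.Policy, RDF.type, RDFS.Class))
g.add((EX.holdsPolicy, RDFS.domain, EX.Customer))
g.add((EX.holdsPolicy, RDFS.range, EX.Policy))

# Data: concrete records connected to those concepts.
g.add((EX.acme, RDF.type, EX.Customer))
g.add((EX.policy123, RDF.type, EX.Policy))
g.add((EX.acme, EX.holdsPolicy, EX.policy123))

# Process/context metadata: where the record lives, for governance.
g.add((EX.policy123, EX.sourceSystem, Literal("underwriting_db.policies")))

# People and agents can now ask questions in business terms, not table names.
query = """
SELECT ?customer ?policy WHERE {
  ?customer a ex:Customer ;
            ex:holdsPolicy ?policy .
}
"""
for row in g.query(query, initNs={"ex": EX}):
    print(row.customer, row.policy)
```

The point of the sketch is the shape, not the tooling: concepts, data, and process context live in one graph that both people and AI agents can query.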

In a world where AI agents, complex workflows, and accelerating data demands are reshaping every enterprise, the challenge isn’t just managing data; it’s creating trusted context that connects people, processes, and technology.

Join Rebecca O’Kill, Chief Data & Analytics Officer at Axis Capital, for an Honest No-BS conversation about how her team is transforming governance from a compliance checkbox into a strategic enabler of business value. 

Together, we’ll unpack: 

• Minimal Valuable Governance (MVG): why the old ivory tower “govern everything” mindset fails, and how focusing on just enough governance creates immediate business impact. 

• The ACTIVE framework: a practical approach to governance built on Alignment, Clarity, Trust, Iterative, Value, and Enablement.

• How Axis Capital is embedding governance across the organization by uniting the “front office” (what and why) with the “back office” (how). 

• Why context and knowledge are critical for the next era of agentic AI and multi-agent workflows, and how Axis is preparing for it today. 

By the end, you’ll see how Axis Capital is turning governance into a competitive advantage and why this approach is essential for any organization looking to thrive in a world of AI-driven automation and connected workflows. 

As enterprises race to unlock AI, many face barriers like poor metadata and weak governance. In this session, Rebecca O’Kill (CDAO of Axis Capital), Tim Gasper, and Juan Sequeda share how AI is not just the outcome of governance—it’s the incentive. Framing AI as the “carrot” motivates adoption of governance as a strategic enabler. Learn how AI-powered governance, data marketplaces, and knowledge graphs together provide context, drive smarter metadata, and enable impactful AI use cases like underwriting agents that require structured and unstructured data.

Coalesce 2024: What does enterprise AI lose by not investing in semantics and knowledge?

In this talk, we will make the case that the success of enterprise AI depends on an investment in semantics and knowledge, not just data. Our LLM accuracy benchmark research provided evidence that layering semantic layers/knowledge graphs over enterprise SQL databases increases the accuracy of LLMs at least 4x for question answering. This work has been reproduced and validated by many others, including dbt Labs. It's fantastic that semantics and knowledge are getting the attention they deserve. We need more.

This talk is targeted at 1) those who believe AI accuracy can be improved by simply adding more data to fine-tune/train models, and 2) believers in semantics and knowledge who need help getting executive buy-in.

We will dive into:

• The knowledge engineering work that needs to be done

• Who should be leading this work (hint: analytics engineers)

• What companies lose by not doing this knowledge engineering work
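The benchmark itself is described in the research referenced above; the toy, runnable sketch below (Python, using sqlite3 and rdflib) only illustrates the intuition behind the result: a query generated over a knowledge graph can use business terms directly, while a query generated over raw tables has to guess what cryptic columns mean. The schema, data, and queries here are all invented for the illustration, not taken from the benchmark.

```python
import sqlite3
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

# Raw SQL world: a model must guess that clm.cl_st = 'O' means "open claim".
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE clm (cl_id INT, cl_st TEXT)")
db.executemany("INSERT INTO clm VALUES (?, ?)", [(1, "O"), (2, "C")])
print(db.execute("SELECT COUNT(*) FROM clm WHERE cl_st = 'O'").fetchone())

# Knowledge-graph world: the semantics are explicit, so a generated query
# can speak the business language instead of decoding column names.
EX = Namespace("http://example.com/claims#")
g = Graph()
g.add((EX.claim1, RDF.type, EX.Claim))
g.add((EX.claim1, EX.status, Literal("Open")))
g.add((EX.claim2, RDF.type, EX.Claim))
g.add((EX.claim2, EX.status, Literal("Closed")))

query = """
SELECT (COUNT(?c) AS ?n) WHERE { ?c a ex:Claim ; ex:status "Open" }
"""
for row in g.query(query, initNs={"ex": EX}):
    print(row.n)  # same answer, but the query reads like the question
```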

Speaker: Juan Sequeda, Principal Scientist and Head of AI Lab, data.world

Read the blog to learn about the latest dbt Cloud features announced at Coalesce, designed to help organizations embrace analytics best practices at scale https://www.getdbt.com/blog/coalesce-2024-product-announcements

Juan Sequeda is a principal scientist and head of the AI Lab at data.world, and is also the co-host of the fantastic data podcast Catalog & Cocktails. This episode tackles semantics, the semantic web, Juan's research into how raw text-to-SQL performs versus text-to-semantic-layer, and where we both believe AI will make an impact in the world of structured data analytics. For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics Engineering Podcast is sponsored by dbt Labs.

Podcast Episode
with Santona Tuli (Upsolver), Tim Gasper (data.world from ServiceNow), Juan Sequeda (data.world), Joe Reis (DeepLearning.AI)

Are your outputs generating the right outcomes? I'm in Austin for Data Day Texas, and I reflect on this topic via a conversation I had last night with Juan Sequeda, Tim Gasper, and Santona Tuli.

In 2024, outcomes will matter more than ever. What are you doing to drive the right outcomes for your organization?

Juan Sequeda and I chat about knowledge graphs (he's an OG in this area), the potential of LLMs on structured datasets, and much more. This is an honest, no-BS chat about the transition from a data-first world to a knowledge-first world. Enjoy!

LinkedIn: https://www.linkedin.com/in/juansequeda/

data.world: https://data.world/product/

website: https://www.juansequeda.com/

Summary

The data ecosystem has seen a constant flurry of activity for the past several years, and it shows no signs of slowing down. With all of the products, techniques, and buzzwords being discussed, it can be easy to be overcome by the hype. In this episode Juan Sequeda and Tim Gasper from data.world share their views on the core principles that you can use to ground your work and avoid getting caught in the hype cycles.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

When you're ready to build your next pipeline, or want to test out the projects you hear about on the show, you'll need somewhere to deploy it, so check out our friends at Linode. With their new managed database service you can launch a production-ready MySQL, Postgres, or MongoDB cluster in minutes, with automated backups, 40 Gbps connections from your application hosts, and high-throughput SSDs. Go to dataengineeringpodcast.com/linode today and get a $100 credit to launch a database, create a Kubernetes cluster, or take advantage of all of their other services. And don't forget to thank them for their continued support of this show!

Modern data teams are dealing with a lot of complexity in their data pipelines and analytical code. Monitoring data quality, tracing incidents, and testing changes can be daunting and often takes hours to days or even weeks. By the time errors have made their way into production, it’s often too late and damage is done. Datafold built automated regression testing to help data and analytics engineers deal with data quality in their pull requests. Datafold shows how a change in SQL code affects your data, both on a statistical level and down to individual rows and values, before it gets merged to production. No more shipping and praying; you can now know exactly what will change in your database! Datafold integrates with all major data warehouses as well as frameworks such as Airflow & dbt and seamlessly plugs into CI workflows. Visit dataengineeringpodcast.com/datafold today to book a demo with Datafold.

RudderStack helps you build a customer data platform on your warehouse or data lake. Instead of trapping data in a black box, they enable you to easily collect customer data from the entire stack and build an identity graph on your warehouse, giving you full visibility and control. Their SDKs make event streaming from any app or website easy, and their extensive library of integrations enables you to automatically send data to hundreds of downstream tools. Sign up free at dataengineeringpodcast.com/rudder

Build Data Pipelines. Not DAGs. That’s the spirit behind Upsolver SQLake, a new self-service data pipeline platform that lets you build batch and streaming pipelines without falling into the black hole of DAG-based orchestration. All you do is write a query in SQL to declare your transformation, and SQLake will turn it into a continuous pipeline that scales to petabytes and delivers up-to-the-minute fresh data. SQLake supports a broad set of transformations, including high-cardinality joins, aggregations, upserts, and window operations. Output data can be streamed into a data lake for query engines like Presto, Trino, or Spark SQL, a data warehouse like Snowflake or Redshift, or any other destination you choose. Pricing for SQLake is simple: you pay $99 per terabyte ingested into your data lake using SQLake, and run unlimited transformation pipelines for free. That way data engineers and data users can process to their heart’s content without worrying about their cloud bill. For Data Engineering Podcast listeners, we’re offering a 30-day trial with unlimited data, so go to dataengineeringpodcast.com/upsolver today and see for yourself how to avoid DAG hell.

Your host is Tobias Macey, and today I'm interviewing Juan Sequeda and Tim Gasper about their views on the role of the data mesh paradigm in driving re-assessment of the foundational principles of data systems.