talk-data.com

Topic

Data Modelling

Tags: data_governance · data_quality · metadata_management

355 tagged activities

Activity Trend

[Chart: quarterly activity counts, 2020-Q1 through 2026-Q1; peak of 18 activities per quarter]

Activities

355 activities · Newest first

In this touching episode, we center the voice of middle school student leader Karime Reyes who helped to spearhead a radical redesign of her school’s first week last August. In conversation with Peer Resource class teacher Tamar Sberlo, Karime brings insight and depth to what students really want in a classroom, what it means to belong, and how it felt to be part of a school transformation process. We also get a window into Tamar’s “low control” teaching style and the ways she strives to give learners choice and voice. Walk with us as we enter the world of MLK Jr Middle School in San Francisco and see the Street Data model come to life!

For Further Learning: 

Watch Sir Ken Robinson’s mind-blowing video (~12 minutes) on “Changing Education Paradigms” to situate the conversation with Tamar and Karime in a historical context.
Courageous Conversations About Race: A Field Guide for Achieving Equity in Schools by Glenn E. Singleton
For White Folks Who Teach in the Hood… and the Rest of Y’all Too: Reality Pedagogy and Urban Education by Chris Emdin

CockroachDB: The Definitive Guide, 2nd Edition

CockroachDB is the distributed SQL database that handles the demands of today's data-driven applications. The second edition of this popular hands-on guide shows software developers, architects, and DevOps/SRE teams how to use CockroachDB for applications that scale elastically and provide seamless delivery for end users while remaining indestructible. Data professionals will learn how to migrate existing applications to CockroachDB's performant, cloud-native data architecture. You'll also quickly discover the benefits of strong data correctness and consistency guarantees, plus optimizations for delivering ultra-low latencies to globally distributed end users.

- Uncover the power of distributed SQL
- Learn how to start, manage, and optimize projects in CockroachDB
- Explore best practices for data modeling, schema design, and distributed infrastructure
- Discover strategies for migrating data into CockroachDB
- See how to read, write, and run ACID transactions across distributed systems
- Maximize resiliency in multiregion clusters
- Secure, monitor, and fine-tune your CockroachDB deployment for peak performance
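
As a taste of the transaction material listed above: CockroachDB speaks the PostgreSQL wire protocol, so a standard Postgres driver works unmodified. Below is a minimal sketch in Python with psycopg2; the DSN, table, and amounts are hypothetical, and the retry loop is a simplified version of the serialization-failure retry that CockroachDB's documentation recommends. This is an illustration, not an excerpt from the book.

```python
# Minimal sketch of an ACID transaction against CockroachDB, which
# listens on the PostgreSQL wire protocol (default port 26257). The
# DSN, table, and account IDs are hypothetical.
import psycopg2
from psycopg2 import errors

DSN = "postgresql://root@localhost:26257/bank?sslmode=disable"  # hypothetical

def transfer(conn, src: int, dst: int, amount: int) -> None:
    """Move `amount` between two hypothetical account rows atomically."""
    with conn.cursor() as cur:
        cur.execute("UPDATE accounts SET balance = balance - %s WHERE id = %s",
                    (amount, src))
        cur.execute("UPDATE accounts SET balance = balance + %s WHERE id = %s",
                    (amount, dst))

with psycopg2.connect(DSN) as conn:
    for attempt in range(3):
        try:
            transfer(conn, 1, 2, 100)
            conn.commit()
            break
        except errors.SerializationFailure:
            conn.rollback()  # contended transaction: safe to retry
```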

Summary In this episode of the Data Engineering Podcast Bartosz Mikulski talks about preparing data for AI applications. Bartosz shares his journey from data engineering to MLOps and emphasizes the importance of data testing over software development in AI contexts. He discusses the types of data assets required for AI applications, including extensive test datasets, especially in generative AI, and explains the differences in data requirements for various AI application styles. The conversation also explores the skills data engineers need to transition into AI, such as familiarity with vector databases and new data modeling strategies, and highlights the challenges of evolving AI applications, including frequent reprocessing of data when changing chunking strategies or embedding models.
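
One way to see why changing a chunking strategy forces the reprocessing Bartosz describes: chunk boundaries are a pure function of the parameters, so new parameters invalidate every stored chunk and its embedding. A minimal sketch of fixed-size chunking with overlap follows; the sizes are arbitrary illustrations, not recommendations from the episode.

```python
# Illustrative fixed-size chunker with overlap, the simplest chunking
# strategy. Changing size or overlap changes every chunk boundary, so
# the whole corpus must be re-chunked and re-embedded.
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

chunks = chunk_text("lorem ipsum " * 500)  # placeholder corpus
print(len(chunks), len(chunks[0]))
```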

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.

Your host is Tobias Macey and today I'm interviewing Bartosz Mikulski about how to prepare data for use in AI applications.

Interview

- Introduction
- How did you get involved in the area of data management?
- Can you start by outlining some of the main categories of data assets that are needed for AI applications?
- How does the nature of the application change those requirements? (e.g. RAG app vs. agent, etc.)
- How do the different assets map to the stages of the application lifecycle?
- What are some of the common roles and divisions of responsibility that you see in the construction and operation of a "typical" AI application?
- For data engineers who are used to data warehousing/BI, what are the skills that map to AI apps?
- What are some of the data modeling patterns that are needed to support AI apps?
  - chunking strategies
  - metadata management
- What are the new categories of data that data engineers need to manage in the context of AI applications?
  - agent memory generation/evolution
  - conversation history management
  - data collection for fine tuning
- What are some of the notable evolutions in the space of AI applications and their patterns that have happened in the past ~1-2 years that relate to the responsibilities of data engineers?
- What are some of the skills gaps that teams should be aware of and identify training opportunities for?
- What are the most interesting, innovative, or unexpected ways that you have seen data teams address the needs of AI applications?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on AI applications and their reliance on data?
- What are some of the emerging trends that you are paying particular attention to?

Contact Info

- Website
- LinkedIn

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

Links

- Spark
- Ray
- Chunking Strategies
- Hypothetical document embeddings
- Model Fine Tuning
- Prompt Compression

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

The Well-Grounded Data Analyst

Complete eight data science projects that lock in important real-world skills—along with a practical process you can use to learn any new technique quickly and efficiently. Data analysts need to be problem solvers—and The Well-Grounded Data Analyst will teach you how to solve the most common problems you'll face in industry. You'll explore eight scenarios that your class or bootcamp won’t have covered, so you can accomplish what your boss is asking for.

In The Well-Grounded Data Analyst you'll learn:
- High-value skills to tackle specific analytical problems
- Deconstructing problems for faster, practical solutions
- Data modeling, PDF data extraction, and categorical data manipulation
- Handling vague metrics, deciphering inherited projects, and defining customer records

The Well-Grounded Data Analyst is for junior and early-career data analysts looking to supplement their foundational data skills with real-world problem solving. As you explore each project, you'll also master a proven process for quickly learning new skills, developed by author and Half Stack Data Science podcast host David Asboth. You'll learn how to determine a minimum viable answer for your stakeholders, identify and obtain the data you need to deliver, and reliably present and iterate on your findings. The book can be read cover-to-cover or opened to the chapter most relevant to your current challenges.

About the Technology
Real-world data analysis is messy. Success requires tackling challenges like unreliable data sources, ambiguous requests, and incompatible formats—often with limited guidance. This book goes beyond the clean, structured examples you see in classrooms and bootcamps, offering a step-by-step framework you can use to confidently solve any data analysis problem like a pro.

About the Book
The Well-Grounded Data Analyst introduces you to eight scenarios that every data analyst is bound to face. You’ll practice author David Asboth’s results-oriented approach as you model data by identifying customer records, navigate poorly defined metrics, extract data from PDFs, and much more! It also teaches you how to take over incomplete projects and create rapid prototypes with real data. Along the way, you’ll build an impressive portfolio of projects you can showcase at your next interview.

What's Inside
- Deconstructing problems
- Handling vague metrics
- Data modeling
- Categorical data manipulation

About the Reader
For early-career data scientists.

About the Author
David Asboth is a data generalist, educator, and software architect. He co-hosts the Half Stack Data Science podcast.

Quotes
"Well reasoned and well written, with approaches to solve many sorts of data analysis problems." - Naomi Ceder, Fellow of the Python Software Foundation
"An excellent resource for any aspiring data scientist!" - Andrew R. Freed, IBM
"David’s clear and repeatable framework will give you confidence to tackle open-ended stakeholder requests and reach an answer much faster!" - Shaun McGirr, DevOn Software Services
"A book version of shadowing a senior data analyst while they explain handling frequent data problems at work, including all the ugly gotchas." - Randy Au, Google
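
On the categorical data manipulation skill listed above, here is a small hedged sketch in pandas of the general idea; the column and values are invented and are not an excerpt from the book's projects.

```python
# Hedged sketch of categorical data manipulation in pandas with
# invented data. An ordered categorical makes sorting and comparisons
# follow business meaning ("free" < "pro" < "enterprise") instead of
# alphabetical order.
import pandas as pd

df = pd.DataFrame({"plan": ["free", "pro", "free", "enterprise", "pro"]})

df["plan"] = pd.Categorical(
    df["plan"], categories=["free", "pro", "enterprise"], ordered=True)

print(df["plan"].value_counts().sort_index())
print(df[df["plan"] >= "pro"])  # ordered comparison now works
```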

It's 2025! We made it! ;)

In this podcast, I rant about why data modeling matters more than ever, AI, and why humans will seek out "human" things in 2025 and beyond.

❤️ Your support means a lot. Please like and rate this podcast on your favorite podcast platform.

🤓 My works:

📕Fundamentals of Data Engineering: https://www.oreilly.com/library/view/fundamentals-of-data/9781098108298/

🎥 Deeplearning.ai Data Engineering Certificate: https://www.coursera.org/professional-certificates/data-engineering

🔥Practical Data Modeling: https://practicaldatamodeling.substack.com/

🤓 My SubStack: https://joereis.substack.com/

It's December 31, 2024. Gordon Wong and I wrap up 2024 and chat about what we're excited about in 2025 in data and otherwise.


Matt Housley and I have a LONG chat about working in consulting, leaving your job, AI, the job market, our thoughts on what's coming in 2025, and much more.


This morning, a great article came across my feed that gave me PTSD. It asks whether Iceberg is the Hadoop of the Modern Data Stack.

In this rant, I bring the discussion back to a central question you should ask with any hot technology - do you need it at all? Do you need a tool built for the top 1% of companies at a sufficient data scale? Or is a spreadsheet good enough?

Link: https://blog.det.life/apache-iceberg-the-hadoop-of-the-modern-data-stack-c83f63a4ebb9


Summary In this episode of the Data Engineering Podcast the inimitable Max Beauchemin talks about reusability in data pipelines. The conversation explores the "write everything twice" problem, where similar pipelines are built without code reuse, and discusses the challenges of managing different SQL dialects and relational databases. Max also touches on the evolving role of data engineers, drawing parallels with front-end engineering, and suggests that generative AI could facilitate knowledge capture and distribution in data engineering. He encourages the community to share reference implementations and templates to foster collaboration and innovation, and expresses hopes for a future where code reuse becomes more prevalent.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Data migrations are brutal. They drag on for months—sometimes years—burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.

Your host is Tobias Macey and today I'm joined again by Max Beauchemin to talk about the challenges of reusability in data pipelines.

Interview

- Introduction
- How did you get involved in the area of data management?
- Can you start by sharing your current thesis on the opportunities and shortcomings of code and component reusability in the data context?
- What are some ways that you think about what constitutes a "component" in this context?
- The data ecosystem has arguably grown more varied and nuanced in recent years. At the same time, the number and maturity of tools has grown. What is your view on the current trend in productivity for data teams and practitioners?
- What do you see as the core impediments to building more reusable and general-purpose solutions in data engineering?
- How can we balance the actual needs of data consumers against their requests (whether well- or un-informed) to help increase our ability to better design our workflows for reuse?
- In data engineering there are two broad approaches: code-focused or SQL-focused pipelines. In principle one would think that code-focused environments would have better composability. What are you seeing as the realities in your personal experience and what you hear from other teams?
- When it comes to SQL dialects, dbt offers the option of Jinja macros, whereas SDF and SQLMesh offer automatic translation. There are also tools like PRQL and Malloy that aim to abstract away the underlying SQL. What are the tradeoffs across those options that help or hinder the portability of transformation logic?
- Which layers of the data stack/steps in the data journey do you see the greatest opportunity for improving the creation of more broadly usable abstractions/reusable elements?
  - low/no code systems for code reuse
  - impact of LLMs on reusability/composition
  - impact of background on industry practices (e.g. DBAs, sysadmins, analysts vs. SWE, etc.)
  - polymorphic data models (e.g. activity schema)
- What are the most interesting, innovative, or unexpected ways that you have seen teams address composability and reusability of data components?
- What are the most interesting, unexpected, or challenging lessons that you have learned while working on data-oriented tools and utilities?
- What are your hopes and predictions for sharing of code and logic in the future of data engineering?

Contact Info

- LinkedIn

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

Links

- Max's Blog Post
- Airflow
- Superset
- Tableau
- Looker
- PowerBI
- Cohort Analysis
- NextJS
- Airbyte (Podcast Episode)
- Fivetran (Podcast Episode)
- Segment
- dbt
- SQLMesh (Podcast Episode)
- Spark
- LAMP Stack
- PHP
- Relational Algebra
- Knowledge Graph
- Python Marshmallow
- Data Warehouse Lifecycle Toolkit (affiliate link)
- Entity Centric Data Modeling Blog Post
- Amplitude
- OSACon presentation
- ol-data-platform: Tobias' team's data platform code

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
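
On the automatic-translation option discussed in the interview: SQLMesh is built on the open source sqlglot parser/transpiler, which can be tried on its own. A minimal sketch follows; the query is an invented example, not one from the episode.

```python
# Minimal sketch of automatic SQL dialect translation with sqlglot,
# the library underlying SQLMesh. Transpiling rewrites dialect-specific
# functions (here, DuckDB's STRFTIME) into the target dialect's form.
import sqlglot

duckdb_sql = "SELECT STRFTIME(x, '%y-%-m-%S') FROM events"
hive_sql = sqlglot.transpile(duckdb_sql, read="duckdb", write="hive")[0]
print(hive_sql)  # date-format tokens are rewritten for Hive
```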

Gunnar Morling: Data Contracts In Practice With Debezium and Apache Flink

🌟 Session Overview 🌟

Session Name: Data Contracts In Practice With Debezium and Apache Flink
Speaker: Gunnar Morling
Session Description: Log-based change data capture (CDC) is an invaluable part of the data engineering toolbox: it enables a variety of use cases such as real-time analytics, full-text search, or cache invalidation by publishing data change events from your database. But when publishing change event streams across context or team boundaries, aren’t you tying external consumers to your application’s data model, thus limiting your ability to evolve it?

Enter data contracts—consciously designed abstractions between your internal data model and the outside world. Come and join us for this session to learn about:

- Challenges you may encounter when exposing table-level change event streams and how data contracts can mitigate them
- Implementation strategies for data contracts, such as the outbox pattern (sketched below) and stream processing
- Evolving your data model and the corresponding data contracts without breaking any existing consumers

We’ll also touch on some advanced topics at the intersection of CDC and stream processing, such as hydrating partial change events using the popular change stream processing duo of Debezium and Apache Flink.
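
As a hedged sketch of the outbox pattern named above: the business write and the event it implies are committed in one transaction, and the CDC tool captures only the outbox table, so the published stream can never diverge from the actual state change. The DSN and table/column names below are invented; the columns loosely follow the common outbox convention rather than anything specific from this session.

```python
# Hedged sketch of the outbox pattern. The order row and the event
# describing it commit atomically; a CDC tool such as Debezium then
# captures the outbox table. Names are invented for illustration.
import json
import uuid
import psycopg2

with psycopg2.connect("postgresql://app@localhost/orders") as conn:  # hypothetical
    with conn.cursor() as cur:
        order_id = str(uuid.uuid4())
        cur.execute("INSERT INTO orders (id, status) VALUES (%s, %s)",
                    (order_id, "PLACED"))
        # The payload is the contract: consumers see this shape, not
        # the internal orders table.
        cur.execute(
            "INSERT INTO outbox (id, aggregate_type, aggregate_id, type, payload)"
            " VALUES (%s, %s, %s, %s, %s)",
            (str(uuid.uuid4()), "order", order_id, "OrderPlaced",
             json.dumps({"orderId": order_id, "status": "PLACED"})))
```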

🚀 About Big Data and RPA 2024 🚀

Unlock the future of innovation and automation at Big Data & RPA Conference Europe 2024! 🌟 This unique event brings together the brightest minds in big data, machine learning, AI, and robotic process automation to explore cutting-edge solutions and trends shaping the tech landscape. Perfect for data engineers, analysts, RPA developers, and business leaders, the conference offers dual insights into the power of data-driven strategies and intelligent automation. 🚀 Gain practical knowledge on topics like hyperautomation, AI integration, advanced analytics, and workflow optimization while networking with global experts. Don’t miss this exclusive opportunity to expand your expertise and revolutionize your processes—all from the comfort of your home! 📊🤖✨

📅 Yearly Conferences: Curious about the evolution of big data and RPA? Check out our archive of past Big Data & RPA sessions. Watch the strategies and technologies evolve in our videos! 🚀
🔗 Find Other Years' Videos:
2023 Big Data Conference Europe: https://www.youtube.com/playlist?list=PLqYhGsQ9iSEpb_oyAsg67PhpbrkCC59_g
2022 Big Data Conference Europe Online: https://www.youtube.com/playlist?list=PLqYhGsQ9iSEryAOjmvdiaXTfjCg5j3HhT
2021 Big Data Conference Europe Online: https://www.youtube.com/playlist?list=PLqYhGsQ9iSEqHwbQoWEXEJALFLKVDRXiP

💡 Stay Connected & Updated 💡

Don’t miss out on any updates or upcoming event information from Big Data & RPA Conference Europe. Follow us on our social media channels and visit our website to stay in the loop!

🌐 Website: https://bigdataconference.eu/, https://rpaconference.eu/
👤 Facebook: https://www.facebook.com/bigdataconf, https://www.facebook.com/rpaeurope/
🐦 Twitter: @BigDataConfEU, @europe_rpa
🔗 LinkedIn: https://www.linkedin.com/company/73234449/admin/dashboard/, https://www.linkedin.com/company/75464753/admin/dashboard/
🎥 YouTube: http://www.youtube.com/@DATAMINERLT

Pietro Mele, Radu Pop: Exploring Large Graphs at the Heart of the French National Audiovisual Institute

🌟 Session Overview 🌟

Session Name: Exploring Large Graphs at the Heart of the French National Audiovisual Institute
Speakers: Pietro Mele, Radu Pop
Session Description: The National Audiovisual Institute (INA) serves as the repository for all French audiovisual archives, maintaining a continuous archiving effort for over 180 radio and television services around the clock since 1995. The metadata generated from this vast collection equates to hundreds of billions of documents, encompassing images, audio and video fragments, and text excerpts.

Given the diverse nature of the content, the data model is inspired by the conceptual frameworks of cultural heritage, structured as an extensive graph with intricate relationships between generic entities.

Building a global search engine for this unique use case presents a dual challenge: ensuring rapid indexing and implementing sophisticated full-text search capabilities with high performance. The presentation will explore the crucial decisions made and the technical infrastructure developed to meet these challenges effectively, starting from data injection and indexing strategies to the implementation of advanced search algorithms and scalable storage solutions.


Luca Pescatore: Data Modelling: Navigating the Data Seas with a Data Vault Compass

🌟 Session Overview 🌟

Session Name: Data Modelling: Navigating the Data Seas with a Data Vault Compass
Speaker: Luca Pescatore
Session Description: Nowadays, data is everywhere. Companies are often drowning in a sea of data, and their data lakes turn into data swamps. Order is needed.

Dive into the evolution of data modeling and learn about use cases and advantages of different modeling structures. In particular, this talk will discuss how Data Vault modeling can be a navigational beacon in the data ocean.
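
For readers new to the pattern: a Data Vault model separates business keys (hubs), relationships between them (links), and descriptive history (satellites). Below is a minimal schema sketch in Python with SQLite, using invented entities rather than anything from the talk.

```python
# Minimal sketch of the three core Data Vault structures: hubs hold
# business keys, links hold relationships between hubs, and satellites
# hold descriptive attributes over time. Entities are invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hub_customer (
    customer_hk   TEXT PRIMARY KEY,   -- hash of the business key
    customer_id   TEXT NOT NULL,      -- the business key itself
    load_dts      TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE hub_order (
    order_hk      TEXT PRIMARY KEY,
    order_id      TEXT NOT NULL,
    load_dts      TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE link_customer_order (
    link_hk       TEXT PRIMARY KEY,
    customer_hk   TEXT REFERENCES hub_customer(customer_hk),
    order_hk      TEXT REFERENCES hub_order(order_hk),
    load_dts      TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE sat_customer_details (
    customer_hk   TEXT REFERENCES hub_customer(customer_hk),
    load_dts      TEXT NOT NULL,
    name          TEXT,
    email         TEXT,
    PRIMARY KEY (customer_hk, load_dts)  -- append-only history
);
""")
```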


AWS re:Invent 2024 - Data modeling core concepts for Amazon DynamoDB (DAT305)

Join this session to learn the core concepts of Amazon DynamoDB data modeling. Explore best practices for common access patterns used by DynamoDB customers for applications that need consistent, fast performance at any scale. Developers experienced with DynamoDB can learn best practices and trade-offs to make when deciding on single-table and multi-table designs, indexing strategies, and more.
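
To make the single-table versus multi-table tradeoff concrete, here is a hedged sketch of the single-table pattern in Python with boto3; the table name, key schema, and items are invented for illustration and are not taken from the session.

```python
# Hedged sketch of the single-table pattern: one table whose generic
# pk/sk attributes encode entity type and access pattern. A customer
# and its orders share a partition key, so a single Query returns the
# profile together with all orders in one round trip.
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("app-table")  # hypothetical table

table.put_item(Item={"pk": "CUSTOMER#42", "sk": "PROFILE", "name": "Ada"})
table.put_item(Item={"pk": "CUSTOMER#42", "sk": "ORDER#2024-12-01#9001",
                     "total": 125})

resp = table.query(KeyConditionExpression=Key("pk").eq("CUSTOMER#42"))
print(resp["Items"])
```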

Learn more: AWS re:Invent: https://go.aws/reinvent. More AWS events: https://go.aws/3kss9CP

Subscribe: More AWS videos: http://bit.ly/2O3zS75 More AWS events videos: http://bit.ly/316g9t4

About AWS: Amazon Web Services (AWS) hosts events, both online and in-person, bringing the cloud computing community together to connect, collaborate, and learn from AWS experts. AWS is the world's most comprehensive and broadly adopted cloud platform, offering over 200 fully featured services from data centers globally. Millions of customers—including the fastest-growing startups, largest enterprises, and leading government agencies—are using AWS to lower costs, become more agile, and innovate faster.

#AWSreInvent #AWSreInvent2024

AWS re:Invent 2024 - Advanced data modeling with Amazon DynamoDB (DAT404)

Amazon DynamoDB is a popular choice for modern applications because it’s a serverless database that provides single-digit millisecond performance at any scale. Optimizing your usage of DynamoDB requires a different approach to data modeling than traditional relational databases. In this session, AWS Data Hero Alex DeBrie shows you advanced techniques to help you get the most out of DynamoDB. Learn how to “think in DynamoDB” by learning the DynamoDB foundations and principles for data modeling. Further, learn practical strategies and DynamoDB features to handle difficult use cases in your application.
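
One staple of advanced DynamoDB modeling is the inverted-index global secondary index (GSI), which serves the reverse of the base table's access pattern. Here is a hedged sketch; the table, GSI, and keys are invented, and the session may use different examples.

```python
# Hedged sketch of an inverted-index GSI: the base table answers
# "orders for a customer", and a GSI with the key attributes swapped
# answers "which customer owns this order". The GSI must already be
# defined on the (hypothetical) table.
import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("app-table")  # hypothetical table

resp = table.query(
    IndexName="sk-pk-index",  # hypothetical GSI with sk as partition key
    KeyConditionExpression=Key("sk").eq("ORDER#2024-12-01#9001"),
)
print(resp["Items"])
```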


AWS re:Invent 2024 - Advanced data modeling for Amazon ElastiCache (DAT422)

This session delves into the intricacies of Amazon ElastiCache data modeling using the purpose-built Valkey data types to optimize application performance and scalability. Explore the use of strings, sets, sorted sets, hashes, bitmaps, and geospatial indexes to model complex relationships and solve use cases such as caching, session store, feature store, real-time analytics, geospatial applications, and rate limiters.
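
As one concrete example of the use cases above, a fixed-window rate limiter needs only an atomic counter with a TTL. Here is a hedged sketch using redis-py, which is wire-compatible with Valkey; the endpoint and limits are invented.

```python
# Hedged sketch of a fixed-window rate limiter built from an atomic
# counter plus a TTL. Endpoint and limits are invented.
import redis

r = redis.Redis(host="localhost", port=6379)  # hypothetical endpoint

def allow_request(user_id: str, limit: int = 100, window_s: int = 60) -> bool:
    key = f"rate:{user_id}"
    count = r.incr(key)          # atomic per-user counter
    if count == 1:
        r.expire(key, window_s)  # first hit starts the window
    return count <= limit

print(allow_request("user-42"))
```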


Managing Data as a Product

Discover how to transform your data architecture with the insights and techniques presented in Managing Data as a Product by Andrea Gioia. In this comprehensive guide, you'll explore how to design, implement, and maintain data-product-centered systems to meet modern demands, achieving scalable and sustainable data management tailored to your organization's needs.

What this book will help me do
- Understand the principles of data-product-centered architectures and their advantages.
- Learn to design, develop, and operate data products in production settings.
- Explore strategies to manage the lifecycle of data products efficiently.
- Gain insights into team topologies and data ownership for distributed systems.
- Discover data modeling techniques for AI-ready architectures.

Author(s)
Andrea Gioia is a renowned data architect and the creator of the Open Data Mesh Initiative. With over 20 years of experience, Andrea has successfully led complex data projects and is passionate about sharing his expertise. His writing is practical and driven by real-world challenges, aiming to equip engineers with actionable knowledge.

Who is it for?
This book is ideal for data engineers, software architects, and engineering leaders involved in shaping innovative data architectures. If you have foundational knowledge of data engineering and are eager to advance your expertise by adopting data-product principles, this book will suit your needs. It is for professionals aiming to modernize and optimize their approach to organizational data management.

Cosmic efficiency: mastering performance and cost in Azure Cosmos DB | BRK194

Are you ready to build high-performance, cost-efficient apps? Join us to master Azure Cosmos DB! This session covers serverless vs. autoscale options, data modeling, query optimization, and advanced features for top performance. Learn how to quickly build apps that can easily grow to planetary scale with data partitioning and fine-tuning strategies. Join us to hear about best practices and insider tips to design cloud-native apps with unmatched performance, scalability, and cost savings!
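
Two of the cost levers the session covers, partition key choice and point reads, can be sketched with the azure-cosmos Python SDK. The account endpoint, key, names, and item below are all invented for illustration.

```python
# Hedged sketch of two Cosmos DB cost/performance basics: a partition
# key chosen at container creation, and point reads (id plus partition
# key), the cheapest way to fetch an item. All names are invented.
from azure.cosmos import CosmosClient, PartitionKey

client = CosmosClient("https://myaccount.documents.azure.com:443/",  # hypothetical
                      credential="<account-key>")
db = client.create_database_if_not_exists("appdb")
container = db.create_container_if_not_exists(
    id="orders", partition_key=PartitionKey(path="/customerId"))

container.upsert_item({"id": "9001", "customerId": "42", "total": 125})

# A point read of a small item costs roughly 1 RU; cross-partition
# queries cost far more, which is where the savings come from.
item = container.read_item(item="9001", partition_key="42")
print(item["total"])
```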

Speakers: Estefani Arroyo, Tara Bhatia

Session Information: This is one of many sessions from the Microsoft Ignite 2024 event. View even more sessions on-demand and learn about Microsoft Ignite at https://ignite.microsoft.com

BRK194 | English (US) | Data

#MSIgnite