talk-data.com
Activities & events
| Title & Speakers | Event |
|---|---|
|
How a Fitness App Learned to Talk to Its Data: The Push30 Story
2025-10-14 · 20:10
Anvar Atash
– Co-Founder and CEO
@ SirDash
A story about Push30's data storytelling and data architecture in a fitness app. |
|
|
Lakekeeper — The Lakehouse Control Plane: Governance at Scale with Iceberg
2025-10-14 · 19:35
Governance at scale for lakehouse architectures powered by Iceberg. |
|
|
Adapting ClickHouse® to Use Apache Iceberg Storage
2025-10-14 · 19:10
Josh Lee
– Open Source Advocate
@ Altinity
,
Maciej Bak
– Support Engineer
@ Altinity
A talk about adapting ClickHouse to use Apache Iceberg storage. |
|
|
Governance at Scale with Iceberg: Unlocking Metadata through Lakekeeper
2025-10-09 · 07:30
Apache Iceberg brings powerful metadata to the table—but how do you turn it into governance that scales without slowing teams down? In this talk, we’ll explore how Lakekeeper builds on Iceberg’s foundation to make data management, fine-grained access control, and cross-platform interoperability seamless. Learn how metadata is becoming the backbone of modern data platforms, and why that matters for anyone using Iceberg together with engines like ClickHouse. |
|
|
Friedrich Rockenbauer, Daedalean
2025-10-09 · 07:10
|
|
|
Mark Frey, Jua AI
2025-10-09 · 06:50
AI/ML
|
|
|
ClickHouse - Latest features and roadmap
2025-10-09 · 06:30
ClickHouse
|
|
|
Evolution & Future of Apache Iceberg (FR)
2025-06-19 · 19:30
Iceberg
|
|
|
The Advent of the Open Data Lake (FR)
2025-06-19 · 18:55
|
|
|
PyData Slovakia #31 & R<-Slovakia Meetup [Viktor Kessler: Apache Iceberg]
2025-06-13 · 14:00
Talk Title: "Governing the Lakehouse: Metadata-Driven Control with Apache Iceberg Catalogs"
Description: Apache Iceberg has redefined how data is stored and queried in modern lakehouses by introducing a table format that supports ACID transactions, time travel, and schema evolution. At the heart of this transformation lies the Iceberg Catalog, a critical component that manages table metadata and connects distributed storage systems with compute engines. Catalogs play a central role in enabling metadata-driven governance, allowing data teams to enforce consistency, traceability, and access control at scale. In this session, we explore how Iceberg’s metadata model empowers key governance capabilities such as auditability, reproducibility, multi-engine interoperability, and simplified lineage tracking. But while Iceberg provides a solid foundation, essential governance features are still emerging. We'll examine what’s missing today: fine-grained policy enforcement, unified access control, real-time metadata observability, and first-class support for data contracts. As Iceberg adoption grows, evolving the catalog layer will be key to achieving enterprise-grade governance in open lakehouse architectures.
Speaker/Bio: Viktor Kessler \| Co-Founder @ Vakamo (https://www.linkedin.com/in/viktor-kessler/, vakamo.com, docs.lakekeeper.io). Open lakehouse ecosystems need more than raw power: they need governance, compatibility and freedom to evolve. In this forward-thinking session, Viktor Kessler, Co-Founder of Vakamo, breaks down the REST Catalog API of Apache Iceberg. You’ll also meet Lakekeeper, an open-source solution extending the REST standard with metadata-driven policy enforcement.
Viktor's main talk (45-60 min) will be followed by an R <- Slovakia / R User Group (sub-group of PyData Slovakia) meetup of about 1.5 hours.
Language of the event: English
Moderator and Host of the event: Radovan Kavický, President & Principal Data Scientist @ GapData Institute; former AI & Data Science Evangelist @ AIslovakIA - National platform for AI development in Slovakia
Registration: via the Meetup.com group's event (https://www.meetup.com/pydata-slovakia-bratislava/events/307510355/) and via Eventbrite (https://www.eventbrite.com/e/pydata-slovakia-meetup-30-maryam-alimardani-navigating-the-phd-journey-tickets-1341011775319?aff=oddtdtcreator). You can also find our event on Facebook (https://www.facebook.com/events/1823801225067220) and LinkedIn (https://www.linkedin.com/events/7337730825595072513/about/). [Disclaimer: If you only mark "going" on the Facebook event, we can't guarantee your seat.]
PyData Bratislava [Python Data Enthusiasts and Users, Data Scientists & Statisticians of all levels from Slovakia]: PyData is a group for users and developers of data analysis tools to share ideas and learn from each other. We gather to discuss how best to apply Python tools, as well as those using R and Julia, to meet the evolving challenges in data management, processing, analytics, and visualization. PyData is organized by NumFOCUS.org, a 501(c)(3) non-profit in the United States. The PyData Code of Conduct governs this meetup.
To discuss any issues or concerns relating to the code of conduct or the behavior of anyone at a PyData meetup, please contact the organizer or NumFOCUS Executive Director Leah Silen (+1 512-222-5449; [email protected]).
Facebook group: https://www.facebook.com/groups/1813599648877946/
Twitter account: https://twitter.com/PyDataBA
LinkedIn group: https://www.linkedin.com/groups/13506080
R <- Slovakia (#RSlovakia) is a sub-group of PyData Slovakia (#PyDataSK) & Bratislava (#PyDataBA) and an active place for discussion between R Enthusiasts and Users, Data Scientists, Economists and Statisticians of all levels in Slovakia using R for data analysis and data visualization. Powered by NumFOCUS, The R Foundation & GapData Institute. The goals are to build an R community in Slovakia and to give R enthusiasts a place to share ideas and learn from each other about how best to apply R to ever-evolving challenges in the vast realm of data analytics, management, processing and visualization. Here we share interesting articles and links to books or others' work in R, ask questions that come up while working on an R project or visualization, and present the results of our own work in R. On LinkedIn: https://www.linkedin.com/groups/13503959 On Twitter: https://twitter.com/PyDataBA/
Organizers: GapData Institute (GDI) (https://www.gapdata.org/) is a nonprofit, nonpartisan research institution harnessing the power of data & the wisdom of economics for public good. \|\| Data. Think. Change. \|\| NumFOCUS (http://www.numfocus.org/) is a 501(c)(3) nonprofit that supports and promotes world-class, innovative, open source scientific computing. The mission of NumFOCUS is to promote sustainable high-level programming languages, open code development, and reproducible scientific research. |
PyData Slovakia #31 & R<-Slovakia Meetup [Viktor Kessler: Apache Iceberg]
|
|
Advanced Lakehouse Management With The LakeKeeper Iceberg REST Catalog
2025-04-21 · 00:45
Viktor Kessler
– Co-founder
@ Vakamo
,
Tobias Macey
– host
Summary: In this episode of the Data Engineering Podcast, Viktor Kessler, co-founder of Vakamo, talks about the architectural patterns in the lakehouse enabled by a fast and feature-rich Iceberg catalog. Viktor shares his journey from data warehouses to developing the open-source project Lakekeeper, an Apache Iceberg REST catalog written in Rust that facilitates building lakehouses with essential components like storage, compute, and catalog management. He discusses the importance of metadata in making data actionable, the evolution of data catalogs, and the challenges and innovations in the space, including integration with OpenFGA for fine-grained access control and managing data across formats and compute engines.
Announcements:
Hello and welcome to the Data Engineering Podcast, the show about modern data management.
Data migrations are brutal. They drag on for months, sometimes years, burning through resources and crushing team morale. Datafold's AI-powered Migration Agent changes all that. Their unique combination of AI code translation and automated data validation has helped companies complete migrations up to 10 times faster than manual approaches. And they're so confident in their solution, they'll actually guarantee your timeline in writing. Ready to turn your year-long migration into weeks? Visit dataengineeringpodcast.com/datafold today for the details.
Your host is Tobias Macey and today I'm interviewing Viktor Kessler about architectural patterns in the lakehouse that are unlocked by a fast and feature-rich Iceberg catalog.
Interview:
Introduction
How did you get involved in the area of data management?
Can you describe what LakeKeeper is and the story behind it?
What is the core of the problem that you are addressing?
There has been a lot of activity in the catalog space recently. What are the driving forces that have highlighted the need for a better metadata catalog in the data lake/distributed data ecosystem?
How would you characterize the feature sets/problem spaces that different entrants are focused on addressing?
Iceberg as a table format has gained a lot of attention and adoption across the data ecosystem. The REST catalog format has opened the door for numerous implementations. What are the opportunities for innovation and improving user experience in that space?
What is the role of the catalog in managing security and governance? (AuthZ, auditing, etc.)
What are the channels for propagating identity and permissions to compute engines? (How do you avoid head-scratching about permission denied situations?)
Can you describe how LakeKeeper is implemented?
How have the design and goals of the project changed since you first started working on it?
For someone who has an existing set of Iceberg tables and catalog, what does the migration process look like?
What new workflows or capabilities does LakeKeeper enable for data teams using Iceberg tables across one or more compute frameworks?
What are the most interesting, innovative, or unexpected ways that you have seen LakeKeeper used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on LakeKeeper?
When is LakeKeeper the wrong choice?
What do you have planned for the future of LakeKeeper?
Contact Info: LinkedIn
Parting Question: From your perspective, what is the biggest gap in the tooling or technology for data management today?
Closing Announcements:
Thank you for listening! Don't forget to check out our other shows. Podcast.init covers the Python language, its community, and the innovative ways it is being used. The AI Engineering Podcast is your guide to the fast-moving world of building AI systems.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
Links: LakeKeeper, SAP, Microsoft Access, Microsoft Excel, Apache Iceberg (Podcast Episode), Iceberg REST Catalog, PyIceberg, Spark, Trino, Dremio, Hive Metastore, Hadoop, NATS, Polars, DuckDB (Podcast Episode), DataFusion, Atlan (Podcast Episode), Open Metadata (Podcast Episode), Apache Atlas, OpenFGA, Hudi (Podcast Episode), Delta Lake (Podcast Episode), Lance Table Format (Podcast Episode), Unity Catalog, Polaris Catalog, Apache Gravitino (Podcast Episode), Keycloak, Open Policy Agent (OPA), Apache Ranger, Apache NiFi
The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA |
Data Engineering Podcast |